Author's table of contents:

Babak Teimourpour

  • Jafar Pahlevani, Majid Sheikhmohammady*, Babak Teimourpour
    Inspection of economic units is one of the instruments of market regulation for establishing calm and stability in the market. An efficient inspection process leads to fewer offenses, greater public trust in governance, and the realization of people's rights. Hence, verifying the correctness of inspectors' performance using the data generated by their inspections in the country's Integrated Inspection Management System (Simba), and detecting fraudulent and harmful behaviors in inspections, such as sham inspections, plays a key role in protecting public trust and the effectiveness of this activity. Recognizing a research gap in analyzing inspection data to monitor inspectors' performance and discover their unusual behavioral patterns in recording inspection results, the present study seeks to identify the pattern of sham inspections, specifically inspections conducted in response to public complaints and reports, and to identify offending inspectors. The detection of inspectors' unusual and fraudulent behavior was pursued by analyzing 1,518 rows of inspectors' performance data, applying the k-means algorithm for clustering and decision tree, logistic regression, naive Bayes, and support vector machine algorithms for classification. Then, alongside internal and external evaluations to assess the quality of the results, decision-science methods combined with expert opinion were used to enrich the practical dimension of the research and obtain a list of fraudulent inspectors; as a result, nine inspectors were identified as offenders and the anomaly was diagnosed as a collective anomaly. Finally, to prevent fraudulent behaviors aimed at circumventing the inspection system, system-level solutions and managerial recommendations were presented.
    Keywords: Inspection, Trade infringements, Public complaints, Fraud detection, Anomaly detection
    Jafar Pahlevani, Majid Sheikhmohammady *, Babak Teimourpour
    Inspection of economic units is one of the instruments of market regulation, and an efficient inspection process leads to crime reduction and trust growth in government and justice. Therefore, analysis of Simba inspection software data, in order to detect and prevent inspectors' anomalous patterns, can ensure public trust and the effectiveness of inspections. Based on a gap identified in the research area, 1,518 performance records were analyzed to detect anomalies. K-means clustering and classification by decision tree, logistic regression, naive Bayes and support vector machine were employed to detect fraud, of which decision tree and logistic regression performed best. The results were then synthesized with an analysis of 243,000 inspection report records. To enhance the practical side of the research, data mining and decision science techniques were employed to find the fraudsters. As a result, a collective anomaly was detected and nine inspectors were identified as fraudsters. Lastly, IT-based solutions such as software redesign, along with managerial recommendations, were presented.
    Keywords: Anomaly Detection, Complaints, Fraud Detection, Infringements, Inspection
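The pipeline this abstract describes (cluster inspectors' performance records, then flag the outlying group) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the feature values, the two features themselves, and the "smaller cluster is suspicious" rule are invented assumptions.

```python
# Toy sketch: k-means over inspector performance features, then flag the
# small/outlying cluster as candidate sham-inspection behaviour.
import random

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # assign each point to its nearest center (squared Euclidean)
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # recompute centers as cluster means
        new_centers = []
        for c, cl in zip(centers, clusters):
            if cl:
                new_centers.append(tuple(sum(x) / len(cl) for x in zip(*cl)))
            else:
                new_centers.append(c)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# invented inspector features: (inspections per day, share closed "no violation")
normal = [(8, 0.30), (7, 0.35), (9, 0.28), (8, 0.33)]
suspect = [(25, 0.97), (28, 0.95)]   # implausibly fast, always "clean"
centers, clusters = kmeans(normal + suspect, k=2)

# flag the smaller cluster as candidate fraudulent behaviour
flagged = min(clusters, key=len)
print(sorted(flagged))
```

The paper additionally classifies the clustered records (decision tree, logistic regression, etc.) and folds in expert judgment; this fragment only shows the clustering step.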
  • Faranak Khonsarian, Babak Teimourpour*, MohammadAli Rastegar

    Finding approaches for price prediction, constructing an optimal portfolio, and achieving greater profit are among the fundamental goals of financial market participants. The aim of this research is to predict the prices of financial assets, including several stock exchange shares, gold, gold coins, and a number of digital currencies, using an LSTM neural network model, and then to construct an optimal portfolio by computing return, risk, and the Sharpe ratio. The data come from the archive of the Tehran Stock Exchange website, the gold, coin and currency information network website, and a digital currency trading website. The price time series of the examined assets cover 2017 to 2020. The Python programming language and the Gephi software were used to build the model and analyze the data. It was found that the LSTM neural network model can predict the price of each financial asset with very low error, and, based on the Sharpe ratio obtained for each financial asset and the correlation matrix, the Vebank and Khbahman 1 stocks, together with the digital currencies TRON, Tether, and Bitcoin, are allocated larger shares in the proposed portfolio.

    Keywords: Portfolio, Digital currency, Price prediction, Financial assets, LSTM neural network
    Faranak Khonsarian, Babak Teimourpour *, MohammadAli Rastegar Sorkheh

    Finding solutions for price prediction, forming an optimal portfolio, and achieving more profit are the basic goals of financial market participants. The purpose of this research is to predict the price of financial assets such as several stocks, gold, gold coins and a number of digital currencies using the LSTM neural network model, and then form an optimal portfolio by calculating the rate of return, risk and the Sharpe ratio. The data used are from the archives of the Tehran Stock Exchange website, the website of the gold, coin and currency information network, as well as a website for buying and selling digital currencies. The time series of the prices of the investigated assets spans 2017 to 2020. We used the Python programming language and Gephi software to build the model and analyze the data. In the end, it was found that the LSTM neural network model is capable of predicting the price of financial assets with a very low error rate for each asset, and according to the Sharpe ratio obtained for each financial asset and the correlation matrix, the Vebank and Khbahman 1 stocks and the digital currencies TRON, Tether and Bitcoin are allocated larger shares in the proposed portfolio.

    Keywords: Price prediction, Portfolio, Financial assets, Digital currency, LSTM neural network
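The portfolio-weighting step the abstract describes relies on the Sharpe ratio of each asset. A minimal sketch of that calculation, with made-up return series and a zero risk-free rate as assumptions:

```python
# Illustrative only: per-asset Sharpe ratio (mean excess return / volatility),
# the ranking signal the abstract uses to weight assets in the portfolio.
import statistics

def sharpe(returns, risk_free=0.0):
    """Mean excess return divided by the standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

# invented return series: asset_A is steady, asset_B is volatile
assets = {
    "asset_A": [0.02, 0.01, 0.03, 0.02, 0.015],
    "asset_B": [0.05, -0.04, 0.06, -0.05, 0.05],
}
ranked = sorted(assets, key=lambda a: sharpe(assets[a]), reverse=True)
print(ranked)  # higher-Sharpe assets get a larger share of the portfolio
```

The actual study combines this ranking with a correlation matrix and LSTM price forecasts; this fragment shows only the risk-adjusted-return piece.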
  • Mohammad Heydari, Babak Teimourpour*

    The rise of the Internet and the exponential increase in data have made manual data summarization and analysis a challenging task. Instagram is a prominent social network widely utilized in Iran for information sharing and communication across various age groups. Its inherent structure, characterized by text-rich content and a graph-like data representation, enables the use of text and graph processing techniques for data analysis. The degree distributions of these networks exhibit scale-free characteristics, indicating non-random growth patterns. Recently, word co-occurrence has gained attention from researchers across multiple disciplines due to its simplicity and practicality. Keyword extraction is a crucial task in natural language processing. In this study, we demonstrated that high-precision extraction of keywords from Persian-language Instagram posts can be achieved using unsupervised word co-occurrence methods, without resorting to conventional techniques such as clustering or pre-trained models. After graph visualization and community detection, it was observed that these graphs represent the top topics covered by news agencies. This approach is generalizable to new and diverse datasets and can provide acceptable outputs for new data. To the authors' knowledge, this method has not previously been employed on the Persian-language Instagram network. The newly crawled data has been publicly released on GitHub for exploration by other researchers. This method also makes it possible to apply other graph-based algorithms, such as community detection. The results help us identify the key role of different news agencies in information diffusion among the public, identify hidden communities, and discover latent patterns in a massive amount of data.

    Keywords: Instagram, Network Science, Social Network Analysis, Graph Mining, Words Co-occurrence
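The unsupervised word co-occurrence idea above can be sketched as: build a co-occurrence graph over a sliding window and rank words by weighted degree. The posts, window size, and English tokens below are invented stand-ins for the Persian Instagram data:

```python
# Rough sketch of keyword extraction via a word co-occurrence graph.
from collections import Counter

def cooccurrence_graph(posts, window=2):
    """Count co-occurrences of word pairs within a sliding window."""
    edges = Counter()
    for post in posts:
        words = post.lower().split()
        for i in range(len(words)):
            for j in range(i + 1, min(i + window + 1, len(words))):
                if words[i] != words[j]:
                    edges[tuple(sorted((words[i], words[j])))] += 1
    return edges

def keyword_scores(edges):
    # weighted degree of each node in the co-occurrence graph
    scores = Counter()
    for (a, b), w in edges.items():
        scores[a] += w
        scores[b] += w
    return scores

posts = [
    "election results announced today",
    "election turnout high today",
    "sports final results tonight",
]
scores = keyword_scores(cooccurrence_graph(posts))
print(scores.most_common(3))
```

In the study, the resulting graph is additionally visualized and partitioned with community detection; here the weighted degree alone serves as the keyword score.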
  • AmirHossein Ahmadi, Babak Teimourpour *, Mahtab Mahbood

    Statistics, data extraction, and analysis are vital in sports science. Information technology and data science can significantly increase the quality of research and of the decisions made by sports clubs and organizations. Currently, many coaches and sports institutions rely on analytics and statistics that are calculated manually. Sports science shows that winning a match depends on different factors. The purpose of this research is to improve team performance by analyzing social networks and communication networks (such as players' passes and interactions during the match) and by analyzing frequently used areas. The results are obtained by analyzing data collected from four matches of the Persepolis team, including three matches from the first half of the Iranian Premier League in 2018-1399 and a Persepolis match against Al-Sharjah. This research examines the issue from two interconnected aspects: (1) examining the performance of players individually and as part of a social network; (2) exploring the communication network between players and pitch areas. The analysis uses an innovative method of identifying and classifying motifs.

    Keywords: Social Network Analysis, Graph Analysis, Motif, Frequent Subgraph, Centrality
  • Mostafa Akhavan-Safar, Babak Teimourpour *, Mahboube Ayyoubi
    One of the important topics in oncology for treatment and prevention is the identification of the genes that initiate cancer in cells. These genes are known as cancer driver genes (CDGs). Identifying driver genes is important both for a basic understanding of cancer and for helping to find new therapeutic targets or biomarkers. Several computational methods for finding cancer driver genes have been developed from genome data. However, most of these methods look for key mutations in genomic data to predict cancer driver genes; they are therefore dependent on mutation and genomic data and often have a high rate of false positives in their results. In this study, we proposed a network-based method, GeneIC, which can detect cancer driver genes without the need for mutation data. The method uses the concept of influence maximization and the independent cascade model. First, a cancer gene regulatory network was created using regulatory interactions and gene expression data. We then ran an independent cascade propagation algorithm on the network to calculate the coverage of each gene. Finally, the genes with the highest coverage were reported as driver genes. The results of our proposed method were compared with 19 previous computational and network-based methods using the F-measure metric and the number of detected drivers. The results showed that the proposed method outperforms the other methods. In addition, more than 25.49% of the driver genes reported by GeneIC are new driver genes that have not been reported by any other computational method.
    Keywords: Gene Regulatory Network, Driver Genes, Influence Maximization, Cancer, Independent Cascade
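The coverage computation at the heart of GeneIC can be illustrated with a simplified independent cascade simulation. The toy regulatory network, activation probability, and trial count below are all assumptions for demonstration; the paper works with real regulatory interactions and expression data:

```python
# Simplified sketch: estimate each gene's expected spread ("coverage") under
# the independent cascade model, then rank genes by coverage.
import random

def cascade_coverage(graph, seed_node, p=0.5, trials=200):
    rng = random.Random(42)   # fixed seed for a reproducible estimate
    total = 0
    for _ in range(trials):
        active, frontier = {seed_node}, [seed_node]
        while frontier:
            node = frontier.pop()
            for nbr in graph.get(node, []):
                # each edge gets one activation attempt with probability p
                if nbr not in active and rng.random() < p:
                    active.add(nbr)
                    frontier.append(nbr)
        total += len(active)
    return total / trials  # expected number of genes reached

# invented directed regulatory network: hub gene "g1" regulates many targets
graph = {"g1": ["g2", "g3", "g4"], "g2": ["g5"], "g3": [], "g4": [], "g5": []}
coverage = {g: cascade_coverage(graph, g) for g in graph}
driver = max(coverage, key=coverage.get)
print(driver)
```

Genes whose cascades reach the most nodes are proposed as driver candidates, mirroring the paper's "highest coverage" rule.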
  • Seyedeh Motahareh Hosseini, Mohammad Aghdasi, Babak Teimourpour*, Amir Albadvi

    Process analysis in engineering, procurement and construction (EPC) companies is far more challenging, and far more important, due to the complexity of the activities, the high level of communication between people, disparate and non-integrated information systems, and the amount of capital involved in these projects. Limited research has been done on mining business processes in these companies. In this study, to analyze the company's performance better and more accurately, three perspectives of process mining (process flow, case, and organizational) are analyzed using the event logs recorded in the supplier selection process. The results led to the identification of challenges in the process, including repetitive loops and duplicate activities, a survey of the factors affecting process execution, and an examination of the relationships between the people involved in the project, all of which can be used to improve the company's future performance.

    Keywords: Process Mining, Process Mining perspectives, Complex Construction Project Management, EPC Companies
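The control-flow perspective mentioned above typically starts from the directly-follows relation of an event log, from which repetitive loops (one of the challenges the study reports) become visible. A minimal sketch, with an invented supplier-selection log rather than the company's real data:

```python
# Derive the directly-follows relation from an event log and spot loops.
from collections import Counter

def directly_follows(log):
    """Count a -> b transitions over all traces of the event log."""
    df = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return df

# made-up traces of a supplier selection process; two contain a rework loop
log = [
    ["request", "evaluate", "approve"],
    ["request", "evaluate", "revise", "evaluate", "approve"],
    ["request", "evaluate", "revise", "evaluate", "approve"],
]
df = directly_follows(log)
# candidate repetitive loops: pairs that occur in both directions
loops = [(a, b) for (a, b) in df if (b, a) in df]
print(df[("evaluate", "revise")], sorted(loops))
```

Real process-mining tools build full process models from this relation; the fragment only shows how loops surface in the counts.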
  • Sajedeh Lashgari, Babak Teimourpour*, Mostafa Akhavan-Safar

    Cancer-causing genes are genes in which mutations cause the onset and spread of cancer. These genes are called driver genes or cancer-causal genes. Several computational methods have been proposed so far to find them. Most of these methods are based on the genome sequencing of cancer tissues: they look for key mutations in genome data to predict cancer genes. This study proposes a new approach called centrality maximization intersection, cMaxDriver, as a network-based tool for predicting cancer-causing genes in the human transcriptional regulatory network. In this approach, we used degree, closeness, and betweenness centralities, without using genome data. We first constructed three cancer transcriptional regulatory networks as benchmarks, using gene expression data and regulatory interactions. We then calculated the three centralities for the genes in each network and considered the nodes with the highest values in each centrality as important genes. Finally, we identified the nodes ranked highest by at least two centralities as cancer causal genes. We compared the results with eighteen previous computational and network-based methods. The results show that the proposed approach significantly improves efficiency and F-measure. In addition, the cMaxDriver approach identifies unique cancer driver genes that other methods cannot.

    Keywords: Cancer-causing genes, Transcriptional regulatory network, Maximization, Centrality, Intersection
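The cMaxDriver selection rule ("highest value in at least two centralities") can be sketched as a top-k intersection. The gene names and centrality scores below are invented; the paper computes real centralities on transcriptional regulatory networks:

```python
# Hedged sketch of the intersection rule: keep genes that rank in the top-k
# of at least two of the three centrality measures.
def top_k(scores, k):
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def cmax_intersection(degree, closeness, betweenness, k=3):
    tops = [top_k(degree, k), top_k(closeness, k), top_k(betweenness, k)]
    # candidate causal genes: highest-ranked in at least two centralities
    return {g for g in set().union(*tops) if sum(g in t for t in tops) >= 2}

# made-up centrality scores for five genes
degree      = {"TP53": 9, "MYC": 8, "EGFR": 7, "GAPDH": 2, "ACTB": 1}
closeness   = {"TP53": .9, "MYC": .7, "EGFR": .8, "GAPDH": .3, "ACTB": .2}
betweenness = {"TP53": .6, "MYC": .5, "GAPDH": .4, "EGFR": .1, "ACTB": .0}
print(sorted(cmax_intersection(degree, closeness, betweenness)))
```

The choice of k (how many top genes per centrality) is an assumption here; the paper selects "nodes with the highest values" per centrality on much larger networks.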
  • Majid Rahimi, Babak Teimourpour *, Mostafa Akhavansafar
    Background
    Cancer is a group of diseases that has received much attention in biological research because of its high mortality rate and the lack of accurate identification of its root causes. In such studies, researchers usually try to identify cancer driver genes (CDGs), the genes that start cancer in a cell. The majority of the methods proposed so far for identifying CDGs are based on gene expression data and the concept of mutation in genomic data. Recently, some models using network techniques and the concept of influence maximization have been proposed to identify these genes.
    Objectives
    We aimed to construct the cancer transcriptional regulatory network and identify cancer driver genes using a network science approach without the use of mutation and genomic data.
    Materials and Methods
    In this study, we employ social influence network theory, building on the concept of the influence and authority of web pages, to identify CDGs in the human gene regulatory network (GRN). First, we create the GRNs using gene expression data and the existing nodes and edges. Next, we apply the modified algorithm to the studied GRNs, weighting the regulatory interaction edges using the influence spread concept. The nodes with the highest ratings are selected as the CDGs.
    Results
    The results show that our proposed method outperforms most other computational and network-based methods in identifying CDGs. In addition, the proposed method identifies many CDGs that are overlooked by all previously published methods.
    Conclusions
    Our study demonstrated that Google's PageRank algorithm can be modified and utilized as a network-based method for identifying cancer driver genes in transcriptional regulatory networks. Furthermore, the proposed method can be considered complementary to computational cancer driver gene identification tools.
    Keywords: Cancer Driver Gene, Diffusion, PageRank, Transcriptional Regulatory Network (TRN)
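Since the study adapts PageRank to regulatory networks, a plain power-iteration PageRank makes the core ranking idea concrete. The toy network and damping factor are assumptions; the paper's modification additionally weights edges by influence spread:

```python
# Illustrative PageRank by power iteration over a directed network.
def pagerank(graph, damping=0.85, iters=100):
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in nodes}
        for v in nodes:
            targets = graph[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # dangling node: spread its rank uniformly
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank

# invented regulatory network: "g2" is regulated by three other genes
graph = {"g1": ["g2"], "g2": ["g3"], "g3": ["g2"], "g4": ["g2"]}
rank = pagerank(graph)
print(max(rank, key=rank.get))
```

Genes with the highest rank would be the CDG candidates under the paper's approach; here the heavily targeted hub comes out on top.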
  • Fateme Shahrabi Farahani, Meysam Alavi, Mina Ghasemi, Babak Teimourpour *
    Today, due to the large volume of data and the high speed of data production, it is practically impossible to analyze data using traditional methods. Data mining, one of the most popular topics of the present century, has contributed to the advancement of science and technology in many areas, and in the recent decade researchers have made extensive use of it to analyze data. One of the most important issues for researchers is to identify the common mainstreams in the field of data mining and to find its active research areas for future work. Social network analysis, meanwhile, has in recent years attracted researchers' attention as a suitable tool for studying present and future relationships between the entities of a network structure. In this paper, using co-word analysis and social network analysis, the scientific structure and map of data mining topics in Iran is drawn based on papers indexed between 1388 and 1398 in the Civilica database, and the thematic trends governing research in this area are reviewed. The analysis shows that, within data mining, concepts such as clustering, classification, decision trees, and neural networks account for the largest volume of research, and that applications such as data mining in medicine, fraud detection, and customer relationship management have made the greatest use of data mining techniques.
    Keywords: Data Mining, Scientific Map, Co-word Analysis, Social Network Analysis
  • Farzaneh Sandoughdaran, Amir Albadvi*, Babak Teimourpour
    Objective

    In recent decades, as the cost of acquiring new customers has risen steadily, retaining customers and increasing their loyalty has become critical to the profitability of organizations. Organizations therefore run various programs to increase customer retention. On the other hand, not all customers are equally profitable, and an organization's limited resources should be spent on its valuable customers. The aim of this research is to present a mathematical model for the optimal selection of target customers for retention programs, as well as for choosing the spending level for each customer.

    Methodology:

    The research was carried out in three main steps. In the first step, the customers' churn probability is obtained using data mining. In the second step, customer lifetime value is calculated, and in the last step the proposed optimization model is solved. The LP-metric method was used to solve the model, implemented in the GAMS software. Real data from one of the country's insurance organizations was used in solving the proposed model.

    Findings:

    A bi-objective optimization model based on customer lifetime value is presented. One objective function maximizes customer lifetime value through the retention program; the other minimizes program costs.

    Conclusion:

    Solving the proposed optimization model yields a Pareto frontier; taking expert opinion into account, each point on this frontier can be an optimal solution for selecting the target customers of retention programs and deciding how to spend on them.

    Keywords: Customer lifetime value, Customer churn, Retention programs, Data mining, Bi-objective optimization
    Farzaneh Sandoughdaran, Amir Albadvi *, Babak Teimourpour
    Objective

     Developing an optimization model based on CLV for selecting target customers for retention programs

    Method

      This research consists of three steps. In the first step, the churn probability of each customer is determined using data mining methods. In the second step, the customer lifetime value of each customer is calculated. Finally, in the last step, the model is solved using the LP-metric method in GAMS software. We applied the proposed model to one of the country's insurance organizations.

    Findings

      A bi-objective optimization model based on CLV is proposed for selecting target customers for the retention program and selecting relevant costs for each customer. One of the objective functions is set to maximize CLV from performing retention programs and the other one is set to minimize program costs.

    Conclusion

      By solving the proposed optimization model, a Pareto diagram is obtained; taking the opinion of experts into account, each point on this diagram can be an optimal answer for selecting the target customers of retention programs and deciding how to spend on them.

    Keywords: Customer Lifetime Value, Retention Programs, Data mining, Bi-objective Optimization
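The LP-metric step this abstract relies on can be sketched as: normalize each objective against its individual optimum and pick the decision minimizing the combined deviation. Everything below (customers, CLV uplifts, costs, equal weights, brute-force enumeration) is an invented miniature; the paper solves a real bi-objective model in GAMS:

```python
# Toy LP-metric scalarization over customer subsets for a retention program.
from itertools import combinations

# invented (CLV uplift, retention cost) per customer
customers = {"c1": (120, 40), "c2": (80, 10), "c3": (30, 25)}

def objectives(subset):
    uplift = sum(customers[c][0] for c in subset)
    cost = sum(customers[c][1] for c in subset)
    return uplift, cost

subsets = [frozenset(s) for r in range(len(customers) + 1)
           for s in combinations(customers, r)]
best_uplift = max(objectives(s)[0] for s in subsets)   # ideal of objective 1
best_cost = min(objectives(s)[1] for s in subsets)     # ideal of objective 2
total_cost = sum(c for _, c in customers.values())

def lp_metric(subset, w=(0.5, 0.5)):
    uplift, cost = objectives(subset)
    # normalised deviation of each objective from its ideal value
    d1 = (best_uplift - uplift) / best_uplift
    d2 = (cost - best_cost) / total_cost
    return w[0] * d1 + w[1] * d2

chosen = min(subsets, key=lp_metric)
print(sorted(chosen))
```

Sweeping the weights w traces out the Pareto frontier the abstract mentions; with equal weights this tiny instance favors the cheap, high-uplift customer.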
  • Seyed Kamal Chaharsooghi*, Abolfazl Nabavi, Babak Teimourpour
    One important aspect of condition-based maintenance (CBM) is predicting the remaining useful life (RUL) of a machine from its past records and current state. Lubricant oil analysis is one CBM method: because the oil is in direct contact with the machine, its condition reflects the machine's health. The CBM process generates and accumulates a great deal of data, but the knowledge contained in these data is not fully understood, which wastes valuable resources. Extracting information and knowledge from these data requires methods such as data mining. In this research, based on the definition of RUL, the best model for predicting the remaining operating time until a critical state was built for a bulldozer model with a data mining approach, using engine oil analysis records (a dataset with 2,700 records and 129 features). To build the best model, after preparing a suitable dataset with 49 records and four features, models were built with regression and neural network methods. Because oil changes may occur between sampling intervals, the models were built with two methods of applying the values of the independent features. Based on the performance evaluation of the models, the best model was a neural network using the second method of applying independent feature values, i.e. the new (cumulative) values of two independent features (Fe, Cu) and the actual (non-cumulative) value of one independent feature (Vis40), with a prediction error of 958559.033 +/- 23526.662.
    Keywords: Condition-based maintenance, Oil analysis, Remaining useful life, Data mining
    Seyed Kamal Chaharsooghi *, Abolfazl Nabavi, Babak Teimourpour
    One of the important aspects of condition-based maintenance (CBM) is the prediction of remaining useful life (RUL) based on past records and the current state of the device. Lubricant oil analysis is one CBM method which, due to its direct contact with the device, reflects the device's health through its own condition. The CBM process generates and accumulates a large mass of data, but the knowledge contained in these data cannot be fully understood, resulting in the loss of valuable resources. To extract information and knowledge from these data, methods such as data mining are necessary. In this study, based on the definition of RUL, the best model for predicting the remaining operating time until a critical state for a bulldozer model has been created with a data mining approach, based on engine oil analysis records (a dataset with 2,700 records and 129 features). To create the best model, regression and neural network models were built after preparing a proper dataset with 49 records and 4 features. Due to the possibility of an oil change occurring between sampling intervals, the models were created using two methods of applying the independent feature values. Based on the performance evaluation of the models, the best model was a neural network with the second method of applying independent feature values, which uses the new (cumulative) values of two independent features (Fe, Cu) and the actual (non-cumulative) value of one independent feature (Vis40), with a prediction error of 958559.033 +/- 23526.662.
    Keywords: Condition based maintenance, Oil analysis, Remaining useful life, Data Mining
  • Abdolreza Mosaddegh, Amir Albadvi*, Mohammad Mehdi Sepehri, Babak Teimourpour
    For decades, organizations focused on their brand and products more than on their customers, but economic enterprises have now turned to building and maintaining effective customer relationships. In such conditions, understanding customers and their needs has become vital for organizations. One of the most widely used approaches to understanding customers is to segment them into homogeneous groups and characterize each segment; however, traditional, static customer segmentation cannot keep up with the rapid changes of today's dynamic markets. In the age of modern communications and technologies, customers constantly move between segments. Understanding the patterns of change and the dynamics of customer segments is a key factor in gaining deep customer insight, predicting market changes, and even steering them effectively. Most prior research on this topic has tried to formulate a general, cross-industry model for interpreting customer dynamics, whereas the nature of customer segments and their dynamic patterns differ considerably from one industry to another. Considering the characteristics of a specific industry (banking), the present study has explored customer dynamics patterns using big data analytics tools. The results reveal eight types of dynamic patterns and the relationships among them in the studied industry and, based on these, propose approaches for predicting customers' future dynamics and steering them to improve the effectiveness of marketing activities.
    Keywords: Customer dynamics, Industry-specific dynamic patterns, Customer lifetime value, Banking industry, Big data analytics tools
    Abdolreza Mosaddegh, Amir Albadvi *, Mohammad Mehdi Sepehri, Babak Teimourpour
    For decades, enterprises focused on their brand and products rather than on their customers; now, however, economic enterprises focus on building and maintaining effective customer relationships. In such conditions, recognizing customers and their needs has become vital for organizations. One of the most widely used methods for recognizing customers is to segment them into homogeneous groups and characterize each segment, but traditional, static customer segmentation cannot respond to the rapid changes of today's dynamic markets. In the era of modern communication and technology, customers constantly move between segments. Knowing the patterns of change and the dynamics of customer segments is a key factor in gaining deep customer insight, predicting market changes, and even managing them effectively. Most studies in the literature attempt to develop a general, cross-industry model for interpreting customer dynamics, while the nature of customer segments and their dynamic patterns differ completely from one industry to another. The present study, considering the characteristics of a particular industry (banking), explores customer dynamics using big data analytics. The results revealed eight categories of patterns and the associations between them, which can be used to predict the future dynamics of customers and direct them to improve the effectiveness of marketing activities in the studied industry.
    Keywords: Customers Dynamics, Industry-specific Patterns, Customer Lifetime Value (CLV), Banking Industry, Big Data Analytics
  • Esmaeil Alinezhad, Babak Teimourpour*

    Community detection is one of the emerging and popular branches of data mining and social network analysis, with many applications in discovering and analyzing communities in websites, biological networks, scientific and research networks, and so on. Community detection over web pages can, in particular, help website administrators allocate optimal bandwidth across the network of web pages they manage. Most existing community detection methods use only the network topology (links, edges) to group nodes (web pages), whereas recent research shows that such methods should be modified so that, in addition to topology, the intrinsic attributes of nodes are also taken into account. Therefore, in this paper, for the first time, a mathematical model for community detection in internet networks is developed that simultaneously considers the intrinsic attributes of web pages and the links between them. In the proposed method, to incorporate attributes into community detection, the similarity of web pages is first computed with a mathematical approach, using a similarity measure (such as Jaccard or the matching coefficient) and the attribute vectors, and is added as a weight to the existing edges between them. In effect, this transforms an attributed internet network with unweighted edges into an attribute-free network with weighted edges. Then, using a mathematical model designed for networks with weighted edges, the communities in this weighted network are discovered. For validation and proof of efficiency, statistical hypothesis tests were used to assess the claim that the quality of the communities detected by the proposed mathematical approach (which considers web page attributes) is statistically better than that of previous mathematical models (which ignore attributes). The results of statistical tests on a real internet network show that the proposed model, when using the Jaccard measure to compute web page similarity, detects significantly better communities (P-value = 0.01) than previous mathematical models.
The results of further statistical tests also show that choosing a similarity measure suited to the nature of the network has a substantial effect on the quality of the proposed approach.

    Keywords: Community detection, Modularity optimization, Network topology, Internet network, Web pages, Mathematical model, Node attributes
    Esmaeil Alinezhad, Babak Teimourpour *

    Community detection is one of the emerging and well-known topics in data mining and social network analysis, with a wide variety of applications in discovering communities in real-world networks such as biological networks, internet weblogs, scientific and research websites, etc. Web community detection can especially help admins assign optimal bandwidth to the websites of their own networks. Most web community detection approaches use only the network topology to discover web communities. However, the most recent research shows that traditional community detection methods have to be substantially modified to consider web page attributes as well as network topology. Therefore, in this paper, a mathematical programming approach is developed for community detection in attributed internet networks that simultaneously considers both network topology and node attributes. In this approach, the similarities of web pages are first calculated using node attributes and a chosen similarity measure, and are used as the weights of the corresponding edges. The communities of the resulting weighted network are then detected by the proposed mathematical model. To validate the approach and demonstrate its efficiency, it is hypothesized that the communities it detects have better quality than those of previous models. Experimental results show that the proposed approach significantly improves the quality of detected web communities when the model uses the Jaccard index. The results of the other hypotheses, however, indicate that the correct selection of the similarity measure has a significant impact on the quality of the detected communities.

    Keywords: community detection, Internet network, mathematical model, Modularity optimization, Network topology, Node attributes, web pages
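The pre-processing step described above (turning an attributed, unweighted network into an unattributed, weighted one) can be sketched with Jaccard similarity over attribute sets. The pages, attributes, and edges below are invented examples:

```python
# Sketch: weight each edge with the Jaccard similarity of its endpoints'
# attribute sets, producing the weighted network the model then partitions.
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over two attribute sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# made-up web pages and their attribute sets
attributes = {
    "p1": {"news", "fa"},
    "p2": {"news", "fa", "politics"},
    "p3": {"shop", "en"},
}
edges = [("p1", "p2"), ("p2", "p3")]
weighted = {(u, v): jaccard(attributes[u], attributes[v]) for u, v in edges}
print(weighted)
```

Edges between attribute-similar pages get high weights and are favored by the subsequent (weighted) community detection model; the matching coefficient mentioned in the abstract could be swapped in the same way.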
  • Ebrahim Mazrae Farahani, Reza Baradaran Kazemzade*, Amir Albadvi, Babak Teimourpour
    Social networks play a significant role in everyday life. Recent studies show the importance of, and increasing interest in, modeling and monitoring the communications between network members as longitudinal data. The tendency to model social networks by considering the dependency of an outcome variable on covariates has been growing recently; however, these studies fail to consider the possible correlation between responses when modeling social networks. Our study uses generalized linear mixed models (GLMMs), also referred to as random effects models, to model the social network according to the attributes of nodes, where the nodes play the role of a random or hidden effect in the model. The likelihood ratio test (LRT) statistic is used to detect change points in simulated network streams. In the simulation studies, we applied root mean square error (RMSE) and standard deviation criteria to choose an appropriate distribution for the simulated data. Our simulation studies also demonstrate an improvement in the average run length (ARL) index in comparison to previous studies.
    Keywords: Social network monitoring, Generalized Linear Mixed Models, likelihood ratio test (LRT), Average Run Length (ARL)
  • Amir Albadvi *, Hossein Hashemi, Mohammad Reza Amin-Naseri, Babak Teimourpour
    Reliability is a fundamental factor in the operation of bus transportation systems because it is a direct indicator of service quality and of operators' costs. Today, the application of GPS technology in bus systems makes big data available, though it brings the difficulty of preprocessing the data in a methodical way. In this study, principal component analysis is utilized to systematically assess reliability indicators based on automatic vehicle location (AVL) data. In addition, the significant indicators affecting bus reliability are identified using a statistical analysis framework. The proposed bus reliability assessment framework can be applied to a single bus route or to a complete network. The methodology has been validated through computational experiments on real-world AVL datasets extracted from the bus system in Qazvin, Iran. The analysis indicates that (1) on-time performance, (2) headway regularity, (3) the standard deviation of bus travel time, and (4) the 50th percentile travel time are key indicators of the reliability of bus services. The potential of the proposed methodology to provide insights for bus operators is discussed. Using the proposed approach, the desirable reliability status of bus lines can be identified from the point of view of key stakeholders, and ways to improve reliability can be defined more clearly.
    Keywords: AVL data, bus reliability, regression analysis, mobility management
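As an illustration of the PCA step, the sketch below diagonalizes the correlation matrix of a few hypothetical per-route reliability indicators. The indicator values and route count are invented for illustration, not taken from the Qazvin data.

```python
import numpy as np

# Hypothetical per-route reliability indicators (rows = bus routes):
# on-time %, headway coefficient of variation, travel-time std (min),
# 50th-percentile travel time (min)
X = np.array([
    [92.0, 0.25, 3.1, 34.0],
    [75.0, 0.55, 7.8, 41.0],
    [88.0, 0.30, 4.0, 36.0],
    [61.0, 0.70, 9.5, 47.0],
    [95.0, 0.20, 2.5, 33.0],
])

# Standardize each indicator, then diagonalize the correlation matrix:
Z = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]           # sort components by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
print(explained.round(3))  # first component dominates: indicators co-vary
```

Because unreliable routes tend to score badly on all four indicators at once, the first principal component captures most of the variance, which is what makes PCA useful for condensing many AVL-derived indicators into a few reliability scores.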
  • Ebrahim Abbasi, Babak Teimourpour, Arefeh Molaee, Zahra Esmaeili
    The aim of this study is to use conditional value at risk (CVaR) as a downside risk measure for constructing an optimal stock portfolio on the Tehran Stock Exchange. CVaR is defined as the weighted average of expected losses exceeding VaR, and it possesses the convexity and subadditivity properties. The data used in this study are the 15-day returns of 45 companies over the period 01/07/1388 to 31/05/1392 (Iranian calendar). Using the Chow breakpoint test, 01/07/1392 was selected as the market breakpoint, so the data were divided into two periods, before and after the breakpoint. The results of the paired-sample sign test show that the CVaR of the second period is greater than that of the first, with a correspondingly higher expected return in the second period. Ten optimal portfolios were then constructed for each period and the corresponding efficient frontiers were drawn. The efficient frontier also indicates that the Tehran Stock Exchange prospered in the second period.
    Keywords: Optimal portfolio, Conditional value at risk, Efficient frontier, Market breakpoint
    Ebrahim Abbasi, Babak Teimourpour, Arefeh Molaee, Zahra Esmaeili
    The purpose of this study is to use conditional value at risk (CVaR) as a downside risk measure for portfolio optimization on the Tehran Stock Exchange. CVaR is defined as the weighted average of expected losses exceeding value at risk (VaR), and it has appealing features such as subadditivity and convexity. The data used in this study are 15-day returns for 45 companies from 07/01/1388 to 05/31/1392. Using the Chow breakpoint test, 01/07/1392 was selected as the market breakpoint; therefore, the data were divided into two periods, before and after the breakpoint. The results of the sign test reveal that the second-period CVaR is greater than the first-period CVaR, and the corresponding expected return is higher in the second period. Ten optimal portfolios were then calculated for each period at various confidence levels, and the efficient frontier was drawn. The efficient frontier also shows that the Tehran Stock Exchange prospered in the second period.
    Keywords: Optimal Portfolio, Conditional Value at Risk, Efficient Frontier, Market Break Point
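The historical CVaR definition used in the abstract can be sketched directly: VaR is the α-quantile of the loss distribution, and CVaR is the mean loss at or beyond VaR, so CVaR ≥ VaR always holds. The returns below are synthetic, not the study's data.

```python
import numpy as np

def var_cvar(returns, alpha=0.95):
    """Historical VaR and CVaR of a return series.

    Losses are the negated returns; VaR is the alpha-quantile of losses
    and CVaR is the mean loss at or beyond VaR. CVaR >= VaR by
    construction, consistent with its coherence properties.
    """
    losses = -np.asarray(returns, dtype=float)
    var = np.quantile(losses, alpha)
    cvar = losses[losses >= var].mean()
    return var, cvar

rng = np.random.default_rng(0)
rets = rng.normal(0.001, 0.02, size=10_000)  # synthetic 15-day returns
var, cvar = var_cvar(rets, alpha=0.95)
print(round(var, 4), round(cvar, 4))
```

In a portfolio setting, `returns` would be the portfolio's historical return series, and the optimization would search for weights minimizing CVaR at a target expected return; convexity is what makes that search tractable.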
  • Saeed Nasehimoghaddam*, Mehdi Ghazanfari, Babak Teimourpour
    As a way of simplifying, reducing the size of, and making sense of the structure of a social network, blockmodeling consists of two major components: partitioning actors into equivalence classes, called positions, and clarifying the relations between and within positions. Actors can be partitioned into positions in various ways, and the ties between and within positions can be represented by density matrices, image matrices, and reduced graphs. While actor partitioning in classic blockmodeling is based on equivalence definitions such as structural and regular equivalence, generalized blockmodeling uses a local optimization procedure to search for the partition vector that best satisfies a predetermined image matrix. The need for a known, predefined social structure, together with the use of a local search procedure to fit a partition vector to that predefined image matrix, restricts generalized blockmodeling. In this paper, we formulate the blockmodeling problem and employ a genetic algorithm to search for the partition vector that best fits the original relational data in terms of known indices. Across multiple samples and various settings, such as dichotomous, signed, ordinal, or interval-valued relations and multiple relations, the quality of the results shows better fitness to the original relational data than the solutions reported in the classic, generalized, and stochastic blockmodeling literature.
    Keywords: Social Network Analysis (SNA), blockmodeling, Genetic Algorithm, likelihood ratio statistics, G2, Multi objective optimization
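A minimal building block of any blockmodeling search, including the genetic algorithm described here, is evaluating a candidate partition vector. The sketch below computes the density matrix of a partition on a toy network; a search procedure would compare such matrices against the relational data or a target image matrix as its fitness. The network and partition are invented for illustration.

```python
import numpy as np

def block_densities(adj, partition, k):
    """Density matrix of a blockmodel: the fraction of present ties in
    each block (each pair of positions), excluding self-loops in the
    diagonal blocks."""
    adj = np.asarray(adj)
    dens = np.zeros((k, k))
    for r in range(k):
        for c in range(k):
            rows = [i for i, p in enumerate(partition) if p == r]
            cols = [j for j, p in enumerate(partition) if p == c]
            ties = adj[np.ix_(rows, cols)].sum()
            pairs = len(rows) * len(cols) - (len(rows) if r == c else 0)
            dens[r, c] = ties / pairs if pairs else 0.0
    return dens

# Two-position toy network: dense within each block, empty between them.
adj = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
])
partition = [0, 0, 0, 1, 1]
print(block_densities(adj, partition, 2))  # → [[1. 0.] [0. 1.]]
```

A genetic algorithm would encode `partition` as the chromosome and evolve a population of such vectors toward the best-fitting blockmodel, avoiding the local search traps the abstract mentions.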
  • Seyed Ali Lajevardy, Mehrdad Kargari *, Babak Teimourpour, Siamak Kargar
    Introduction
    Every year, outbreaks of diseases such as influenza-like illness (ILI) and other contagious illnesses impose various costs on public and non-governmental agencies. Most of these expenses arise from unpreparedness for such outbreaks; appropriate preparation will reduce them. A system able to recognize these outbreaks can earn income in two ways: first, by selling predictions to government agencies so that they can equip and prepare themselves, reducing the imposed costs; and second, by selling predictions to pharmaceutical companies to guide their production of the required drugs when a disease spreads. These predictions can also identify probable markets for those companies.
    Methods
    Both earning methods are considered in this model, and costs and incomes are discussed according to basic business models (especially in the health field). To execute the model, the Internet is used both to receive information from doctors and to deliver the prediction service.
    To ensure doctors' collaboration in the data collection process, the amount of money paid to them is proportional to the rate at which they send patients' information. On the other hand, customers can access outbreak prediction information about a specific illness after a one-time payment or a monthly subscription to the system. All money transferred in this system would be handled via online credit systems.
    Results
    This business model offers three main values: recognizing disease outbreaks at the right time, identifying contributing factors, and estimating the spreading rate of the disease. Customers are categorized according to the value provided to them and include pharmaceutical companies and drug importers, the government, insurance companies, universities, and research centers. Considering the various markets, the model has an ROI of 0.5, meaning the investment is recovered within six months.
    Conclusion
    According to the results, the business model developed in this study has fair value and is feasible and well suited to the web. The model develops a medical information network with proper marketing and earns good profits; its most critical resource is the outbreak-detection algorithm, which must be properly constructed and used.
    Keywords: Disease Outbreak, Business Coalition Healthcare, Internet, Health Services Availability
  • Nima Riahi, Seyyed Mahdi Hosseini-Motlagh*, Babak Teimourpour
    Background And Objectives
    Efficient cost management of hospitals' pharmaceutical inventories has the potential to contribute remarkably to the optimization of overall hospital expenditures. To this end, reliable models for accurate forecasting of future pharmaceutical demand are instrumental. While linear methods are frequently used for forecasting, chiefly because of their simplicity, they have serious deficiencies in capturing the nonlinearities of real-world problems. On the other hand, real-world time series are rarely purely linear or purely nonlinear, calling for forecasting models that account for both features of the data. To help meet this need in the healthcare domain, this study develops a hybrid framework consisting of a linear and a nonlinear component to improve forecasting of operating rooms' pharmaceutical demand.
    Methods
    A hybrid modeling framework was developed, combining an Autoregressive Integrated Moving Average (ARIMA) model as the linear component and an Artificial Neural Network (ANN) as the nonlinear component. The method encompasses three phases: (1) fitting a linear ARIMA model to the targeted time series, (2) building an ANN model on the residuals of the ARIMA model, and (3) combining the ARIMA and ANN models into the hybrid model for the final forecast. Using the pharmaceutical inventory database of the Iranian Mohem Hospital to fit the ARIMA model and train the ANN, the forecast performance of all three models was compared by calculating the corresponding mean squared error and mean absolute error values, and by superimposing the time series of the operating rooms' drug demand independently predicted by each model on the corresponding observed series.
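The three phases can be sketched with simple stand-ins: an AR(1) least-squares fit in place of the ARIMA component and a quadratic residual regression in place of the ANN. The data and both component models are illustrative only, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic daily drug demand: linear trend plus a nonlinear wobble.
t = np.arange(200, dtype=float)
y = 50 + 0.3 * t + 0.002 * (t - 100) ** 2 + rng.normal(0, 1.0, 200)

# Phase 1 -- linear component (stand-in for ARIMA): AR(1) by least squares.
X = np.column_stack([np.ones(199), y[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
linear_pred = X @ coef

# Phase 2 -- nonlinear component (stand-in for the ANN): quadratic
# regression of the linear model's residuals on the lagged series.
resid = y[1:] - linear_pred
Q = np.column_stack([np.ones(199), y[:-1], y[:-1] ** 2])
qcoef, *_ = np.linalg.lstsq(Q, resid, rcond=None)

# Phase 3 -- hybrid forecast: linear part plus the modeled residual.
hybrid_pred = linear_pred + Q @ qcoef

mse_linear = np.mean((y[1:] - linear_pred) ** 2)
mse_hybrid = np.mean((y[1:] - hybrid_pred) ** 2)
print(round(mse_linear, 3), round(mse_hybrid, 3))
```

Because the second stage fits whatever structure the linear model leaves behind, the hybrid's in-sample error can only match or beat the linear component's, which is the intuition behind the ARIMA-ANN combination.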
    Findings
    Both quantitative and intuitive comparisons demonstrated that the hybrid ARIMA-ANN framework outperforms the forecasting capability of either the ARIMA or the ANN model alone. In particular, the hybrid model showed remarkably superior capability in capturing the nonlinear behavior of the operating rooms' pharmaceutical demand time series.
    Conclusions
    Our proposed framework lays the groundwork for developing mathematical and computational forecasting models of ever higher predictive accuracy and supports the use of such models for practical cost optimization in health facilities.
    Keywords: Hospital, Operating Room, Pharmaceutical demand, Forecasting, Time Series Data, Data Mining, Artificial Neural Networks
  • Pejman Shadpour, Babak Teimourpour, Rouhangiz Asadi
    Background And Objectives
    Active presence on the Internet is becoming a hallmark of a hospital's commitment to quality healthcare delivery. Insightful planning toward strong Internet-based information delivery and communication requires continuous monitoring of hospital websites' status. Building on this need, this paper provides, for the first time, a ranking of a large number of Iranian hospital websites based on standard webometric methods.
    Methods
    The study targeted the ranking of all hospitals affiliated with the Iranian Ministry of Health and Medical Education. The names and URLs of the hospitals were obtained from the official website of the Ministry and updated through web searches where needed. Hospital websites with non-standard URLs or extremely limited content were excluded from the study, and the remaining websites were analyzed and ranked according to webometric measures.
    Findings
    A ranking list of 93 hospitals was obtained. The three top-ranked websites belong to hospitals affiliated with Tehran University of Medical Sciences, followed by websites of hospitals of the Beheshti and Shiraz universities of medical sciences. The top 20 websites belong to hospitals affiliated with only seven of the 17 surveyed medical universities. The size, visibility, and richness of hospital websites showed significant intercorrelations (P < 0.001). In addition, regression analysis identified a significant linear relationship between hospital websites' visibility and size (β = 0.6, P < 0.001). At the other extreme, the websites of most hospitals affiliated with the Babol, Ahwaz, Hamedan, and Birjand universities of medical sciences constitute the lowest-ranked group of ten. While these low-ranked websites differ slightly in size, they share an identical rank (the lowest of all) in terms of visibility and richness.
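The reported size-visibility regression can be reproduced in miniature with ordinary least squares. The standardized (size, visibility) scores below are invented for illustration, not the study's data.

```python
import numpy as np

# Hypothetical standardized (size, visibility) scores for five hospital
# websites; the study reports a significant linear relation (beta ~ 0.6).
size       = np.array([-1.2, -0.5, 0.0, 0.4, 1.3])
visibility = np.array([-0.9, -0.2, 0.1, 0.2, 0.8])

# OLS fit of visibility on size (with intercept) via least squares:
X = np.column_stack([np.ones_like(size), size])
(intercept, beta), *_ = np.linalg.lstsq(X, visibility, rcond=None)
print(round(beta, 2))
```

With standardized variables the slope equals the correlation-scaled effect the paper reports; in the actual study it would be accompanied by a significance test on β.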
    Conclusions
    The obtained ranking list can help hospital administrators evaluate the strength of their online presence and plan to improve their status on the web. The fact that the top 20 and the lowest 10 hospital websites cluster within a few medical universities highlights the importance of support from the holding universities for a strong web presence of their affiliated hospitals. In addition, the significant positive relationship between the size and visibility of hospital websites encourages administrators to improve their webometric rank synergistically by increasing the size of their websites.
    Keywords: Webometrics, Hospital, Web, Website, Information, Communication Technology, Internet, Ranking
  • Seyyed Mahmood Izadparast, Ahmad Farahi, Faramarz Fath Nejad, Babak Teimourpour
    Today, the role of customers has shifted from following producers to guiding investors, producers, and even researchers and innovators. Organizations therefore need to know their customers and plan for them. Several statistical and machine learning methods have been used for this purpose, but each has limitations on its own. This study attempts to remove those limitations as far as possible by combining different data mining methods, and accordingly presents a framework for identifying car body insurance customers. The aim is to group the customers who are most similar to one another and, using these groups and their characteristics, predict the risk level of each group. Using this measure (the risk level of each group) together with the type of the customer's policy, the amount of their claims can be predicted, which can greatly help in identifying customers and in setting insurance tariff policies. To this end, two data mining methods, decision trees and clustering, were used to build a model predicting customer risk in the insurance industry. The decision tree technique achieved better results for this purpose, although clustering also produced a good separation of customers.
    Keywords: Data mining, Insurance, Classification, Decision tree, Clustering, Claims
    Seyyed Mahmood Izadparast, Ahmad Farahi, Faramarz Fath Nejad, Babak Teimourpour
    Nowadays, the customers' role has changed from merely accepting producers to leading investors, producers, and even researchers and inventors. It is therefore necessary for organizations to identify their customers well and make plans for them. Some statistical and machine learning methods have been used for this so far; however, these methods alone are not without limitations. Using various data mining methods, this research set out to eliminate those restrictions as far as possible, so that a framework for identifying car insurance customers could be provided. In fact, the purpose was to group the most similar customers and to estimate the level of risk in each group according to its characteristics. Using this measure (i.e., the level of risk in each group) and the type of the customer's policy, the level of claims can be estimated; this criterion can help identify customers and set insurance tariff policies. For this purpose, two data mining methods were used to estimate customers' claims in the insurance industry: decision trees and clustering. The decision tree method gave better results, although at the same time the clustering method also generated a good separation of customers.
    Keywords: Data mining, insurance, classification, decision tree, clustering, claims
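The core step of the decision-tree technique, choosing the split that minimizes weighted Gini impurity, can be sketched on a single hypothetical insurance feature (the ages and risk labels below are invented).

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Best threshold on one numeric feature by weighted Gini impurity --
    the step a decision-tree learner repeats at every node."""
    n = len(values)
    pairs = sorted(zip(values, labels))
    best_t, best_score = None, float("inf")
    for i in range(1, n):
        t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Hypothetical feature: driver age; label: high/low claim risk.
ages = [19, 22, 24, 35, 41, 52]
risk = ["high", "high", "high", "low", "low", "low"]
t, score = best_split(ages, risk)
print(t, score)  # → 29.5 0.0 (a clean split between 24 and 35)
```

A full tree repeats this search recursively over all features; clustering, by contrast, would group the same customers by similarity without using the risk labels at all.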
  • Babak Teimourpour, Mohammad Mehdi Sepehri, Leila Pezeshk
    Abstracts of papers indexed by the Institute for Scientific Information (ISI) are among the most credible indicators for measuring science and technology, and their subject classification is a major challenge in technology management. This paper uses a new text mining method called SUTC to categorize papers by Iranian nanotechnology researchers indexed in ISI journals. This categorization can serve as a suitable measure for policymakers in assessing the country's capabilities in different areas of nanotechnology research. To this end, established nanotechnology standards were first merged to obtain a comprehensive classification of nanomaterials. Then, using information retrieval and text mining methods, the papers were categorized intelligently without prior knowledge of class labels. To evaluate the proposed method, the automatic categorization was compared with the categorization of the same papers by nanotechnology experts. The results indicate good accuracy for the proposed method.
    Keywords: Scientometrics, Nanotechnology, Text mining, Text classification, Clustering, Silhouette measure
    Mohammad Mehdi Sepehri, Babak Teimourpour, Leila Pezeshk
    The ISI (Institute for Scientific Information) index is one of the most valuable and frequently used indicators for assessing indexed papers in science and technology journals. Categorization of these papers is a big challenge in management of technology. This paper introduces a new text categorization method - Silhouette based Unsupervised Text Categorization (SUTC). This method has been used for classifying Iranian nanotechnology papers indexed in ISI. First, a few standards are combined to make a comprehensive hierarchy of nanomaterials. Then, by applying information retrieval and text mining methods, papers are categorized intelligently without prior knowledge of class labels. The method is validated by comparing acquired class labels to the selected papers labeled by experts. Our analysis shows acceptable accuracy.
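The silhouette measure that gives SUTC its name can be sketched directly. The toy 2-D points below stand in for document feature vectors (in the actual method these would be high-dimensional term vectors).

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient: for each point, a = mean distance to
    its own cluster, b = mean distance to the nearest other cluster,
    s = (b - a) / max(a, b). Values near 1 mean well-separated clusters."""
    scores = []
    clusters = set(labels)
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Two tight, well-separated "document" clusters in a toy 2-D space:
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
lab = [0, 0, 0, 1, 1, 1]
print(round(silhouette(pts, lab), 3))  # close to 1
```

An unsupervised categorizer can compare candidate clusterings by this score and keep the one with the highest mean silhouette, which is plausibly the role the measure plays in SUTC.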