Table of Contents

International Journal of Web Research
Volume:2 Issue: 2, Autumn-Winter 2019

  • Publication date: 1398/09/10 (Solar Hijri calendar)
  • Number of titles: 6
  • Faezeh Forootan *, Mohammad Rabiei Pages 1-8
    E-commerce websites, based on their structural ontology, provide access to a wide range of options and the ability to deal directly with manufacturers to obtain cheaper products and services, and they collect users' comments and ideas on those products and services. These comments form a valuable source of information that includes a large number of user reviews. Checking such a volume of published comments manually is difficult. Sentiment analysis, a relatively new field of study, automatically extracts and analyzes people's attitudes and emotions from the text of the comments. The primary objective of this research is to analyze the content of users' comments on e-commerce websites that sell handicraft products online. Sentiment analysis techniques were applied at the sentence level using a machine learning approach. First, the pre-processing steps and the TF-IDF method were applied to the comment texts. Next, the comment texts were classified into two groups, product comments and service comments, using the Support Vector Machine (SVM) algorithm with 99.2% accuracy. Finally, the sentiment of the comments was classified into three groups (positive, negative, and neutral) using the XGBoost algorithm. The results showed accuracies of 95.23% and 95.12% for sentiment classification of comments about products and services, respectively.
    Keywords: Machine Learning, Opinion Mining, Online Reviews, Sentiment Classification, TF-IDF, XGBoost
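    The TF-IDF weighting step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization and the plain tf × log(N/df) variant; the paper's actual pre-processing and weighting scheme are not given in this listing, and the sample comments and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical comments, for illustration only.
comments = [
    "beautiful carpet fast delivery",
    "beautiful vase poor packaging",
    "late delivery poor support",
]
vectors = tf_idf(comments)
```

    Terms that occur in fewer comments (such as "carpet") receive a higher weight than terms shared across comments (such as "beautiful"). The resulting vectors are the kind of input a classifier such as an SVM would consume.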
  • Seyed Faridoddin Kiaei *, Mohammad Dehghan Rouzi, Saeed Farzi Pages 9-14
    Being aware of people's attitudes and emotions about a specific person or event can have a high impact on the decisions of individuals and organizations. With the rise of social networks, specifically Instagram, many people share their attitudes on this social network. Analyzing the emotions of its users can help managers make organizational decisions and predict essential events such as elections. In this research, the EAS system was designed and implemented to extract emotions and visualize them. As a practical example, Instagram users' feelings about the two main candidates in the 12th Iranian presidential election were also examined. The data were Persian Instagram comments collected using a purpose-built crawler. The results show a more positive feeling toward Rouhani than toward Raeisi. In addition, the lexicon-based analysis of Rouhani revealed a high level of the trust emotion, along with anger and disgust. The crawled and preprocessed dataset is publicly available at https://github.com/sfdk74/EAS.
    Keywords: Emotion Analysis, visualization, Instagram, Election
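    As a rough illustration of what lexicon-based emotion analysis involves, the sketch below counts emotion labels for tokens found in a word-to-emotion lexicon. The lexicon entries, the English sample comments, and the function name `emotion_profile` are all hypothetical; the EAS system's actual lexicon and Persian pre-processing are not described in this listing.

```python
from collections import Counter

# Hypothetical mini-lexicon mapping words to emotion labels.
EMOTION_LEXICON = {
    "hope": ["trust", "joy"],
    "support": ["trust"],
    "liar": ["anger", "disgust"],
    "shame": ["disgust"],
}


def emotion_profile(comments):
    """Return the share of each emotion among all lexicon hits in the comments."""
    counts = Counter()
    for comment in comments:
        for token in comment.lower().split():
            # Tokens missing from the lexicon contribute nothing.
            counts.update(EMOTION_LEXICON.get(token, []))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: count / total for emotion, count in counts.items()}


profile = emotion_profile(["We support you", "hope wins", "what a shame"])
# profile == {"trust": 0.5, "joy": 0.25, "disgust": 0.25}
```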
  • Raana Saheb-Nassagh *, Majid Asgari, Behrouz Minaei-Bidgoli Pages 15-22
    The task of extracting semantic relations from raw data is called relation extraction. One of the most important problems in open information extraction is the automatic extraction of relations in any domain, especially in web mining. There are many works and approaches for relation extraction in English and other languages, and some of these approaches are based on parse trees. Dependency parsing in Persian is difficult and time-consuming, since Persian is a low-resource language with a dependency grammar and lexical structure of its own, which also slows relation extraction in Persian. In this paper, we introduce a fast relation extraction method for Persian called RePersian. RePersian relies on the part-of-speech (POS) tags of a sentence and on special relation patterns, which are obtained by analyzing sentence structures in Persian. To find relation patterns, RePersian searches through POS tags that are given as regular expressions. By matching the correct POS pattern to a relation pattern, RePersian extracts the semantic relations in a sentence. We evaluate RePersian in two different scenarios on the Dadegan Persian dependency tree dataset. On average, RePersian achieved precisions of 78.05%, 80.4%, and 54.85% in finding the first argument of a relation, the second argument, and the correct relation between them, respectively.
    Keywords: Relation Extraction, Persian language, Regex, POS Tag
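    The idea of matching a relation pattern against a POS-tag sequence with a regular expression can be sketched as follows. This is not RePersian's actual pattern set: the tag names (N, V, P, ADJ), the single pattern, the English example sentence, and the function name `extract_relation` are assumptions chosen for illustration, and real Persian patterns would reflect Persian word order.

```python
import re

# Hypothetical pattern: noun phrase, verb (optionally with a preposition), noun phrase.
# Each tag is followed by one space so that character offsets map back to tokens.
RELATION_PATTERN = re.compile(
    r"(?P<arg1>(?:\bADJ )*(?:\bN )+)"
    r"(?P<rel>\bV (?:\bP )?)"
    r"(?P<arg2>(?:\bADJ )*(?:\bN )+)"
)


def extract_relation(tagged):
    """tagged: list of (word, tag) pairs. Returns (arg1, relation, arg2) or None."""
    tag_string = "".join(tag + " " for _, tag in tagged)
    match = RELATION_PATTERN.search(tag_string)
    if match is None:
        return None

    def words(group):
        # The number of spaces before an offset equals the token index at that offset.
        start = tag_string[:match.start(group)].count(" ")
        end = tag_string[:match.end(group)].count(" ")
        return " ".join(word for word, _ in tagged[start:end])

    return words("arg1"), words("rel"), words("arg2")


sentence = [("Tehran", "N"), ("lies", "V"), ("in", "P"),
            ("northern", "ADJ"), ("Iran", "N")]
triple = extract_relation(sentence)
# triple == ("Tehran", "lies in", "northern Iran")
```

    Because the search runs over the tag string only, no parse tree is needed, which is the source of the speed advantage the abstract describes.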
  • Hossein Sadr, Mozhdeh Nazari *, Mir Mohsen Pedram, Mohammad Teshnehlab Pages 23-35
    A large number of semantic relatedness measures have been presented over the last decades. Despite the extensive number of studies conducted in this field, understanding of their foundations remains limited in real-world applications. In this paper, state-of-the-art semantic relatedness measures are surveyed, and a unified approach to topic-based models is then proposed to highlight their equivalences and build bridges between their theoretical bases. Presenting a comprehensive unified view of topic-based models gives readers a common understanding of them despite the complexities and differences in their architectures and configuration details. Moreover, it may underpin the fundamental development of these models. Comprehensive experiments on the semantic relatedness of geographic phrases were conducted to evaluate topic-based models against ontology-based models. The results show that topic-based models not only face fewer restrictions in real-world use than ontology-based models, but also perform significantly better in computing the semantic relatedness of geographic phrases.
    Keywords: Semantic Relatedness, Topic-based Models, Latent Semantic Analysis, Latent Dirichlet Allocation, Explicit Semantic Analysis, Geographical Information Science
  • Alireza Mansouri *, Fattaneh Taghiyareh Pages 36-44
    Knowing current public opinion and predicting its trend using opinion formation models has many practical applications. The social impact model of opinion formation is a discrete binary opinion model. It describes how interactions among individuals who share their opinions about a specific topic in a social network affect the dynamics of those opinions and form the opinion of society. The society could be an online social network. In this research, we considered the effect of segregation on opinion formation. Segregation is a phenomenon that arises from homophily and is measured based on network topology. Homophily is the tendency of individuals to interact with others who share similar traits. We used scale-free networks to model interactions between individuals. The social impact model includes a noise parameter, the stochastic part of the model, which accounts for the inexplicable behavior of individuals and the effects of other influentials, e.g., mass media. Since this noise is white noise with no bias toward any possible opinion, for simplicity we assumed a noise-free social impact model, which is valid for the equilibrium analysis we considered. The results reveal that, with the same attributes for the individuals, the more segregated opinion group dominates the less segregated opinion group on average. Therefore, with the same population size and individual characteristics in both opinion groups, segregation is an overall influential factor in opinion formation. A more segregated opinion group attracts some individuals from the other group and becomes the majority opinion group of society in equilibrium.
    Keywords: Opinion Formation, Opinion Dynamics, social network, Social Impact Model, Agent-Based Modeling, Segregation
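    A noise-free binary opinion update of the kind described above can be sketched on a toy network. The update rule here is a simplified assumption (each agent takes the sign of its own strength-weighted opinion plus its neighbours' strength-weighted opinions), not the authors' exact formulation; the six-node network is hand-built rather than scale-free, and all names and values are hypothetical. It does not reproduce the paper's results.

```python
def step(opinions, neighbours, strength, self_support=1.0):
    """One synchronous noise-free update; opinions are +1 or -1."""
    updated = {}
    for node, opinion in opinions.items():
        # Total field: the node's own weighted opinion plus its neighbours'.
        field = self_support * strength[node] * opinion
        field += sum(strength[n] * opinions[n] for n in neighbours[node])
        if field > 0:
            updated[node] = 1
        elif field < 0:
            updated[node] = -1
        else:
            updated[node] = opinion  # a tie keeps the current opinion
    return updated


def run_to_equilibrium(opinions, neighbours, strength, max_steps=100):
    """Iterate until no opinion changes or max_steps is reached."""
    for _ in range(max_steps):
        nxt = step(opinions, neighbours, strength)
        if nxt == opinions:
            return opinions
        opinions = nxt
    return opinions


# Hypothetical network: a tightly knit group (a1..a3) and a loosely knit group (b1..b3).
neighbours = {
    "a1": ["a2", "a3", "b1"],
    "a2": ["a1", "a3", "b1"],
    "a3": ["a1", "a2", "b1", "b2"],
    "b1": ["b2", "a1", "a2", "a3"],
    "b2": ["b1", "b3", "a3"],
    "b3": ["b2"],
}
strength = {node: 1.0 for node in neighbours}
initial = {"a1": 1, "a2": 1, "a3": 1, "b1": -1, "b2": -1, "b3": -1}
final = run_to_equilibrium(initial, neighbours, strength)
```

    In this toy run, node b1 has more links into the other group than into its own and switches opinion, after which no further opinion changes.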
  • Seyed Mehdi Hazrati Fard *, Elham Velayati Pages 45-53

    With the widespread use of the Internet across devices and services, several home and workplace applications have been provided to prevent attacks. Connecting a system or device to an insecure network creates the possibility of infection by unwanted files. Detecting such files is a vital task in any system. Employing machine learning (ML) is the most efficient method to detect these intrusions. On the other hand, malware programmers try to design malicious files that are hard to detect. A file can hide from detection in one feature view, but concealing itself in all views would be very difficult. In this paper, inspired by Multi-View Learning (MVL), we propose to incorporate various features such as opcodes, bytecodes, and system calls to obtain complementary information for identifying a file. To this end, we developed a modified version of the Sparse Representation based Classifier (SRC) to aggregate the effect of all modalities in a unified classifier. To show the efficiency of the proposed method, we used several real datasets. Experimental results show the high performance of the proposed approach and its ability to cope with imbalanced conditions.

    Keywords: Multiview Learning, Sparse representation, Malware Detection, Malware Identification, Imbalanced Condition