Alfaro-García, VG, Merigó, JM, Gil-Lafuente, AM & Kacprzyk, J 2018, 'Logarithmic aggregation operators and distance measures', International Journal of Intelligent Systems, vol. 33, no. 7, pp. 1488-1506.
© 2018 Wiley Periodicals, Inc. The Hamming distance is a well-known measure that is designed to provide insights into the similarity between two strings of information. In this study, we use the Hamming distance, the optimal deviation model, and the generalized ordered weighted logarithmic averaging (GOWLA) operator to develop the ordered weighted logarithmic averaging distance (OWLAD) operator and the generalized ordered weighted logarithmic averaging distance (GOWLAD) operator. The main advantage of these operators is the possibility of modeling a wider range of complex representations of problems under the assumption of an ideal possibility. We study the main properties, alternative formulations, and families of the proposed operators. We analyze multiple classical measures to characterize the weighting vector and propose alternatives to deal with the logarithmic properties of the operators. Furthermore, we present generalizations of the operators, which are obtained by studying their weighting vectors and the lambda parameter. Finally, an illustrative example regarding innovation project management measurement is proposed, in which a multi-expert analysis and several of the newly introduced operators are utilized.
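To make the building block concrete, the following is a minimal Python sketch of an ordered weighted averaging distance of the kind the OWLAD/GOWLAD operators generalize; the weighting vector, the "ideal" profile and the observed values are illustrative assumptions rather than data from the paper, and the logarithmic and generalized (lambda-parameterized) variants change how the individual distances are aggregated.

# Minimal sketch of an OWA-style distance (OWAD), the non-logarithmic
# building block behind the OWLAD/GOWLAD operators described above.
# Weights and data are illustrative only.

def owad(x, y, weights):
    """Aggregate component-wise |x_i - y_i| with an OWA weighting vector.

    The individual distances are reordered in descending order before the
    weights are applied, which is what distinguishes OWA-style aggregation
    from a plain weighted average.
    """
    assert len(x) == len(y) == len(weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    diffs = sorted((abs(a - b) for a, b in zip(x, y)), reverse=True)
    return sum(w * d for w, d in zip(weights, diffs))

# Example: compare an observed alternative against a hypothetical ideal profile.
ideal = [0.9, 0.8, 1.0, 0.7]
observed = [0.6, 0.8, 0.7, 0.9]
print(owad(ideal, observed, [0.4, 0.3, 0.2, 0.1]))  # 0.25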
Ali, AR, Gabrys, B & Budka, M 2018, 'Cross-domain Meta-learning for Time-series Forecasting', Procedia Computer Science, vol. 126, pp. 9-18.
© 2018 The Author(s). There are many algorithms that can be used for the time-series forecasting problem, ranging from simple (e.g. Moving Average) to sophisticated Machine Learning approaches (e.g. Neural Networks). Most of these algorithms require a number of user-defined parameters to be specified, leading to an exponential explosion of the space of potential solutions. Since the trial-and-error approach to finding a good algorithm for solving a given problem is typically intractable, researchers and practitioners need to resort to a more intelligent search strategy, with one option being to constrain the search space using past experience - an approach known as meta-learning. Although potentially attractive, meta-learning comes with its own challenges. Gathering a sufficient number of meta-examples, which in turn requires collecting and processing multiple datasets from each problem domain under consideration, is perhaps the most prominent issue. In this paper, we investigate the situations in which the use of additional data can improve the performance of a meta-learning system, with a focus on cross-domain transfer of meta-knowledge. A similarity-based cluster analysis of meta-features has also been performed in an attempt to discover homogeneous groups of time-series with respect to meta-learning performance. Although the experiments revealed limited room for improvement over the overall best base-learner, the meta-learning approach turned out to be a safe choice, minimizing the risk of selecting the least appropriate base-learner.
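As a rough, hypothetical illustration of the meta-learning setup sketched in this abstract (not the authors' actual system), a meta-model can map simple meta-features of a series to the base-learner that historically performed best on similar series; the meta-features, the base-learner labels and the random series below are placeholders.

# Illustrative meta-learning sketch: a meta-model maps time-series
# meta-features to the base-learner that performed best in past experiments.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def meta_features(series):
    # A few cheap descriptors; real meta-learning systems use many more.
    d = np.diff(series)
    return [np.mean(series), np.std(series), np.mean(d), np.std(d),
            np.corrcoef(series[:-1], series[1:])[0, 1]]

# Meta-examples: (series, label of the best base-learner on it), possibly
# gathered across several problem domains for cross-domain transfer.
rng = np.random.default_rng(0)
past_series = [rng.normal(size=50).cumsum() for _ in range(40)]
best_learner = rng.choice(["moving_average", "arima", "mlp"], size=40)  # placeholder labels

meta_model = RandomForestClassifier(random_state=0)
meta_model.fit([meta_features(s) for s in past_series], best_learner)

new_series = rng.normal(size=50).cumsum()
print(meta_model.predict([meta_features(new_series)]))  # recommended base-learner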
Alshehri, MD, Hussain, FK & Hussain, OK 2018, 'Clustering-Driven Intelligent Trust Management Methodology for the Internet of Things (CITM-IoT)', Mobile Networks and Applications, vol. 23, no. 3, pp. 419-431.
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. The growth and adoption of the Internet of Things (IoT) is increasing day by day. The large number of IoT devices increases the risk of security threats such as (but not limited to) viruses or cyber-attacks. One possible approach to achieving IoT security is to enable a trustworthy IoT environment wherein the interactions are based on the trust value of the communicating nodes. Trust management and trust assessment have been extensively studied in distributed networks in general and the IoT in particular, but there are still outstanding pressing issues, such as bad-mouthing of trust values, which prevent them from being used in practical IoT applications. Furthermore, there is no research on ensuring that the developed IoT trust solutions are scalable across billions of IoT nodes. To address the above-mentioned issues, we propose a methodology for a scalable trust management solution in the IoT. The methodology addresses practical and pressing issues related to IoT trust management such as trust-based IoT clustering, intelligent methods for countering bad-mouthing attacks on trust systems, issues of memory-efficient trust computation and trust-based migration of IoT nodes from one cluster to another. Experimental results demonstrate the effectiveness of the proposed approaches.
Andrade-Valbuena, NA & Merigo, JM 2018, 'Outlining new product development research through bibliometrics', Journal of Strategy and Management, vol. 11, no. 3, pp. 328-350.
Purpose: New product development (NPD) is a noteworthy field that has attracted the attention of scholars for its relevance for firm success. Based on bibliometric indicators and spatial distance network analysis, the authors outline the general structure overview of NPD research through the last 40 years of scientific production; identify and categorize key articles, authors, journals, institutions, and countries related to NPD research; identify and map the research subareas that have mostly contributed to the construction of NPD intellectual structure. The paper aims to discuss these issues. Design/methodology/approach: The work uses the Web of Science Core Collection and the visualization of similarities viewer software. The analysis searches for all the documents connected to NPD available in the database. The graphical visualization maps the bibliographic data in terms of bibliographic coupling and co-citation. Findings: The general NPD citation pattern evidences a construction of knowledge and learning, as evidenced in different subjects, such as biology or physics. Relevant contributions and contributors are highlighted as journals, articles, researchers, countries and institutions in overall NPD research and in its constituent subfields. Five subareas related to the NPD field based on journals and authors network are identified: marketing; operations and production; strategy; industrial engineering and operations; and management. Originality/value: This paper contributes to the NPD literature by offering a global perspective on the field by using bibliometric data graphical networks, provid...
Avilés-Ochoa, E, León-Castro, E, Perez-Arellano, LA & Merigó, JM 2018, 'Government transparency measurement through prioritized distance operators', Journal of Intelligent & Fuzzy Systems, vol. 34, no. 4, pp. 2783-2794.
© 2018 - IOS Press and the authors. All rights reserved. The prioritized induced probabilistic ordered weighted average distance (PIPOWAD) operator has been developed. This new operator is an extension of the ordered weighted average (OWA) operator that can be used in cases where two sets of data need to be compared. Some of the main characteristics of this new operator are: (1) not all the decision makers are equally important, so the information needs to be prioritized; (2) the information has a probability of occurring; and (3) the decision makers can change the importance of the information based on an induced variable. Additionally, characteristics and families of the PIPOWAD operator are presented. Finally, an application of the PIPOWAD operator to measuring government transparency in Mexico is presented.
Baier-Fuentes, H, Cascón-Katchadourian, J, Sánchez, ÁM, Herrera-Viedma, E & Merigó, J 2018, 'A Bibliometric Overview of the International Journal of Interactive Multimedia and Artificial Intelligence', International Journal of Interactive Multimedia and Artificial Intelligence, vol. 5, no. 3, pp. 9-9.
Bargi, A, Xu, RYD & Piccardi, M 2018, 'AdOn HDP-HMM: An Adaptive Online Model for Segmentation and Classification of Sequential Data', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 9, pp. 3953-3968.
© 2012 IEEE. Recent years have witnessed an increasing need for the automated classification of sequential data, such as activities of daily living, social media interactions, financial series, and others. With the continuous flow of new data, it is critical to classify the observations on-the-fly and without being limited by a predetermined number of classes. In addition, a model should be able to update its parameters in response to a possible evolution in the distributions of the classes. This compelling problem, however, does not seem to have been adequately addressed in the literature, since most studies focus on offline classification over predefined class sets. In this paper, we present a principled solution for this problem based on an adaptive online system leveraging Markov switching models and hierarchical Dirichlet process priors. This adaptive online approach is capable of classifying the sequential data over an unlimited number of classes while meeting the memory and delay constraints typical of streaming contexts. In this paper, we introduce an adaptive 'learning rate' that is responsible for balancing the extent to which the model retains its previous parameters or adapts to new observations. Experimental results on stationary and evolving synthetic data and two video data sets, TUM Assistive Kitchen and collated Weizmann, show a remarkable performance in terms of segmentation and classification, particularly for sequences from evolutionary distributions and/or those containing previously unseen classes.
Beck, D, Thoms, JAI, Palu, C, Herold, T, Shah, A, Olivier, J, Boelen, L, Huang, Y, Chacon, D, Brown, A, Babic, M, Hahn, C, Perugini, M, Zhou, X, Huntly, BJ, Schwarzer, A, Klusmann, J-H, Berdel, WE, Wörmann, B, Büchner, T, Hiddemann, W, Bohlander, SK, To, LB, Scott, HS, Lewis, ID, D'Andrea, RJ, Wong, JWH & Pimanda, JE 2018, 'A four-gene LincRNA expression signature predicts risk in multiple cohorts of acute myeloid leukemia patients', Leukemia, vol. 32, no. 2, pp. 263-272.
© 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. Prognostic gene expression signatures have been proposed as clinical tools to clarify therapeutic options in acute myeloid leukemia (AML). However, these signatures rely on measuring large numbers of genes and often perform poorly when applied to independent cohorts or those with older patients. Long intergenic non-coding RNAs (lincRNAs) are emerging as important regulators of cell identity and oncogenesis, but knowledge of their utility as prognostic markers in AML is limited. Here we analyze transcriptomic data from multiple cohorts of clinically annotated AML patients and report that (i) microarrays designed for coding gene expression can be repurposed to yield robust lincRNA expression data, (ii) some lincRNA genes are located in close proximity to hematopoietic coding genes and show strong expression correlations in AML, (iii) lincRNA gene expression patterns distinguish cytogenetic and molecular subtypes of AML, (iv) lincRNA signatures composed of three or four genes are independent predictors of clinical outcome and further dichotomize survival in European Leukemia Net (ELN) risk groups and (v) an analytical tool based on logistic regression analysis of quantitative PCR measurement of four lincRNA genes (LINC4) can be used to determine risk in AML.
Blanco-Mesa, F, Gil-Lafuente, AM & Merigo, JM 2018, 'Dynamics of stakeholder relations with multi-person aggregation', Kybernetes, vol. 47, no. 9, pp. 1801-1820.
Purpose: The purpose of this paper is to develop a novel method to analyse dynamic interactions of stakeholders to explain how a set of agents can act by considering the power/influence positions. Design/methodology/approach: A novel mathematical application uses the importance of characteristics algorithm in combination with composition max-min to compare, group and order information according to the importance of its characteristics. The mathematical application is focused on a strategic analysis, evaluating stakeholder dynamics through power relationships. Findings: The results show a comparison of the relationships among each of the stakeholders to obtain the relative intensity and importance of relationships between them, given by the fuzzy matrix FRInM and the fuzzy matrix FRIM, respectively. This application provides a useful tool for a dynamic analysis of stakeholders in a complex environment, where the best approach to performing a strategic analysis process is sought. Research limitations/implications: The main implication of the proposed approach is taking into account the importance of information to establish the boundaries and relationships of each characteristic according to its intensity. However, limitations are due to the nature of this research, based on theoretical assumptions regarding stakeholders and the use of a hypothetical example to show the operation of algorithms. Originality/value: The primary advantage of this proposition is that it takes into account the im...
Blanco-Mesa, F, Gil-Lafuente, AM & Merigó, JM 2018, 'New aggregation operators for decision-making under uncertainty: an applications in selection of entrepreneurial opportunities', Technological and Economic Development of Economy, vol. 24, no. 2, pp. 335-357.
The main aim of this paper is to study how the economic environment and logical reasoning guide potential entrepreneurs' decision-making process when starting up a new business. The study proposes a new method using the family of selection indices with the OWA operator, which allows information to be aggregated according to its level of importance and its degree of objectivity and subjectivity in the same formulation within the decision-making process. To develop the case study, we take into account several industries of the sports sector and some critical environmental factors that influence competitiveness and entrepreneurship in Colombia when starting a new business. The results show, in an orderly way, all the aggregated information, which can help potential investors and entrepreneurs make a decision based on their preferences. Finally, the applicability of this method to real cases lies in aggregating different sources of information to support decision-making processes.
Blanco-Mesa, F, Gil-Lafuente, AM & Merigó, JM 2018, 'Subjective stakeholder dynamics relationships treatment: a methodological approach using fuzzy decision-making', Computational and Mathematical Organization Theory, vol. 24, no. 4, pp. 441-472.
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. Since the stakeholder theory was proposed to explain the interaction among its agents, extensive approaches have been developed. However, the literature continues to suggest the development of new methodologies that allow an analysis of the dynamics and uncertainty of the relationships between each agent. In this sense, this research proposes a novel methodology for the treatment of subjective stakeholder dynamics using fuzzy decision-making. The study proposes a mathematical methodological perspective for the treatment of subjective relationships among stakeholders, which allows a predictive simulation tool to be developed for attitude and personal preferences to analyze the links among all stakeholders. A mathematical application is developed to help the decision-making process in uncertainty concerning the ordering-according-to-their-importance and linking-of-relation algorithms, which are based on notions of relation, gathering and ordering. A numerical example is proposed to understand the method’s usefulness and feasibility. The results approximate how stakeholder ambiguity and fuzziness can be managed considering the decision-maker’s preference subjectivity. In addition, these results highlight the different relationships among each stakeholder, their intensity levels, the incidence linkage loops and the incidence relative on stakeholder behaviors. The main implication of this proposition is to deal with the subjective preferences provided by the decision-maker to better interpret environmental and subjective factors. Furthermore, this study contributes to the strategic planning and decision-making processes for operative units within an uncertain environment in the short term.
Blanco-Mesa, F, León-Castro, E & Merigó, JM 2018, 'Bonferroni induced heavy operators in ERM decision-making: A case on large companies in Colombia', Applied Soft Computing, vol. 72, pp. 371-391.
© 2018 Elsevier B.V. Averaging aggregation operators analyse a set of data providing a summary of the results. This study focuses on the Bonferroni mean and the induced and heavy aggregation operators. The aim of the work is to present new aggregation operators that combine these concepts, forming the Bonferroni induced heavy ordered weighted average and several particular formulations. This approach represents Bonferroni means with order-inducing variables and with weighting vectors that can be higher than one. The paper also develops some extensions by using distance measures, forming the Bonferroni induced heavy ordered weighted average distance and several particular cases. The study ends with an application to a risk management problem involving large companies in Colombia. The main advantage of this approach is that it provides a more general framework for analysing the data in scenarios where the numerical values may have some complexities that should be assessed with complex attitudinal characters.
Cancino, CA, Merigo, JM, Torres, JP & Diaz, D 2018, 'A bibliometric analysis of venture capital research', Journal of Economics, Finance and Administrative Science, vol. 23, no. 45, pp. 182-195.
Purpose: The purpose of this study is to present the evolution of academic research in venture capital (VC) research between 1990 and 2014. Design/methodology/approach: The study analyzes the most influential journals in VC research by analyzing papers, which were published on the Web of Science database. Findings: Results show a steady increasing rate of VC research during the past 25 years. The paper reports the 40 academic journals that permanently publish articles about VC research. Originality/value: The main contribution of this work is to develop a general overview of the leading journals in VC research, which leads to the development of a future research agenda for bibliometric analysis, such as the review of the most productive and influential authors, universities and countries in VC research.
Carles, M-F, Patricia, H, Antonio, S & José M., M 2018, 'The Forgotten Effects: An Application in the Social Economy of Companies of the Balearic Islands', Economic Computation and Economic Cybernetics Studies and Research, vol. 52, no. 3/2018, pp. 147-160.
© 2018, Bucharest University of Economic Studies. All rights reserved. Few studies have analyzed how to improve the results and productivity of companies with very peculiar characteristics, such as social economy entities. This paper determines the principal worth-creating activities for this type of company operating in the service sector of the Balearic Islands. In order to carry out this work, incidence matrices and the recovery of forgotten effects have been used. Both direct causes and second-generation causes that arise in the majority of socio-economic cases have been identified. In fact, determining the second-generation effects, or forgotten effects, is one of the main contributions of this study, as it shows that those causes that are usually not foreseen, at least in the first instance, notably affect the value that social economy companies generate for the service sector of the Balearic Islands.
Chi, L, Li, B, Zhu, X, Pan, S & Chen, L 2018, 'Hashing for Adaptive Real-Time Graph Stream Classification With Concept Drifts', IEEE Transactions on Cybernetics, vol. 48, no. 5, pp. 1591-1604.
Many applications involve processing networked streaming data in a timely manner. Graph stream classification aims to learn a classification model from a stream of graphs with only one-pass of data, requiring real-time processing in training and prediction. This is a nontrivial task, as many existing methods require multipass of the graph stream to extract subgraph structures as features for graph classification which does not simultaneously satisfy "one-pass" and "real-time" requirements. In this paper, we propose an adaptive real-time graph stream classification method to address this challenge. We partition the unbounded graph stream data into consecutive graph chunks, each consisting of a fixed number of graphs and delivering a corresponding chunk-level classifier. We employ a random hashing function to compress the original node set of graphs in each chunk for fast feature detection when training chunk-level classifiers. Furthermore, a differential hashing strategy is applied to map unlimited increasing features (i.e., cliques) into a fixed-size feature space which is then used as a feature vector for stochastic learning. Finally, the chunk-level classifiers are weighted in an ensemble learning model for graph classification. The proposed method substantially speeds up the graph feature extraction and avoids unbounded graph feature growth. Moreover, it effectively offsets concept drifts in graph stream classification. Experiments on real-world and synthetic graph streams demonstrate that our method significantly outperforms existing methods in both classification accuracy and learning efficiency.
Chiu, SK, Saw, J, Huang, Y, Sonderegger, SE, Wong, NC, Powell, DR, Beck, D, Pimanda, JE, Tremblay, CS & Curtis, DJ 2018, 'A novel role for Lyl1 in primitive erythropoiesis', Development, vol. 145, no. 19.
Stem Cell Leukemia (Scl or Tal1) and Lymphoblastic Leukemia 1 (Lyl1) are highly related members of the basic helix-loop-helix (bHLH) family of transcription factors that are co-expressed in the erythroid lineage. Previous studies suggest that Scl is essential for primitive erythropoiesis. However, analysis of single-cell RNA-sequencing data of early embryos showed that primitive erythroid cells express both Scl and Lyl1. Therefore, to determine whether Lyl1 can function in primitive erythropoiesis, we crossed conditional Scl knockout mice with mice expressing a Cre recombinase under the control of the Epo receptor, active in erythroid progenitors. Embryos with 20% expression of Scl from E9.5 survived to adulthood. However, mice with reduced expression of Scl and absence of Lyl1 (double knockout; DKO) died at E10.5 due to progressive loss of erythropoiesis. Gene expression profiling of DKO yolk sacs revealed loss of Gata1 and many of the known target genes of the SCL-GATA1 complex. ChIP-seq analyses showed that LYL1 exclusively bound a small subset of SCL targets including GATA1. Together, these data show for the first time that Lyl1 can maintain primitive erythropoiesis.
Chotipant, S, Hussain, FK & Hussain, OK 2018, 'SERNOTATE: An automated approach for business service description annotation for efficient service retrieval and composition', Concurrency and Computation: Practice and Experience, vol. 30, no. 1, pp. e4189-e4189.
Summary: Business service advertisements are today published online to convey essential information about services to customers. However, current Web search engines are unable to search and combine online service advertisements. Semantic service annotation is important for its ability to enable machines to understand the meaning of services and support in effective service retrieval and service composition. Existing research in the area of semantic service annotation has focused on the annotation of Web services in a semi‐automated approach. It cannot be applied to business service information as it is not in the form of Web Services Description Language but in free text format. Moreover, semi‐automated approaches are inappropriate for annotating a large amount of online service information which changes dynamically and they are therefore not suitable for the timely dissemination of service information to customers. To solve these issues, we propose SERNOTATE, which is an automated approach for business service description annotation for efficient service retrieval and composition. We propose new semantic‐based linking approaches, namely, Extended Case‐based Reasoning, vector‐based, and classification‐based, that automatically annotate business services to relevant service concepts. Each approach assists in the single‐label and multi‐label annotation of service terms to concept terms to provide a better representation of services. The experimental results test and validate the applicability of the proposed approaches to the automatic annotation of business service descriptions to service concepts on a real‐world dataset.
Deng, Z, Chen, J, Zhang, T, Cao, L & Wang, S 2018, 'Generalized Hidden-Mapping Minimax Probability Machine for the training and reliability learning of several classical intelligent models', Information Sciences, vol. 436-437, pp. 302-319.
© 2018 Elsevier Inc. Minimax Probability Machine (MPM) is a binary classifier that optimizes the upper bound of the misclassification probability. This upper bound of the misclassification probability can be used as an explicit indicator to characterize the reliability of the classification model and thus makes the classification model more transparent. However, the existing related work is constrained to linear models or the corresponding nonlinear models by applying the kernel trick. To relax such constraints, we propose the Generalized Hidden-Mapping Minimax Probability Machine (GHM-MPM). GHM-MPM is a generalized MPM. It is capable of training many classical intelligent models, such as feedforward neural networks, fuzzy logic systems, and linear and kernelized linear models for classification tasks, and realizing the reliability learning of these models simultaneously. Since the GHM-MPM, similarly to the classical MPM, was originally developed only for binary classification, it is further extended to multi-class classification by using the obtained reliability indices of the binary classifiers of two arbitrary classes. The experimental results show that GHM-MPM makes the trained models more transparent and reliable than those trained by classical methods.
Dong, X, Gong, Y & Cao, L 2018, 'F-NSP+: A fast negative sequential patterns mining method with self-adaptive data storage', Pattern Recognition, vol. 84, pp. 13-27.
© 2018 Elsevier Ltd Mining negative sequential patterns (NSP) is an important tool for nonoccurring behavior analysis, and it is much more challenging than mining positive sequential patterns (PSPs) due to the high computational complexity and huge search space when obtaining the support of negative sequential candidates (NSCs). Very few NSP mining algorithms are available and most of them are very inefficient since they obtain the support of NSC by scanning the database repeatedly. Instead, the state-of-the-art NSP mining algorithm e-NSP only uses the PSP's information stored in an array structure to ‘calculate' the support of NSC by equations, without database re-scanning. This makes e-NSP highly efficient, particularly on sparse datasets. However, when datasets become dense, the key process to obtain the support of NSC in e-NSP becomes very time-consuming and needs to be improved. In this paper, we propose a novel and efficient data structure, a bitmap, to obtain the support of NSC. We correspondingly propose a fast NSP mining algorithm, f-NSP, which uses a bitmap to store the PSP's information and then obtain the support of NSC only by bitwise operations, which is much faster than the hash method in e-NSP. Experimental results on real-world and synthetic datasets show that f-NSP is not only tens to hundreds of times faster than e-NSP, but also saves more than ten-fold the storage spaces of e-NSP, particularly on dense datasets with a large number of elements in a sequence or a small number of itemsets. Further, we find that f-NSP consumes more storage space than e-NSP when PSP's support is less than a support threshold sdsup, a value obtained through our theoretical analysis of storage space. Accordingly, we propose a self-adaptive storage strategy and a corresponding algorithm f-NSP+ to overcome this deficiency. f-NSP+ can automatically choose a bitmap or an array structure to store PSP information according to PSP support. Experimental results sho...
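The bitmap idea can be illustrated generically without reproducing the exact e-NSP/f-NSP support equations: if each positive sequential pattern keeps a bitmap recording which sequences of the database contain it, supports of derived candidates reduce to bitwise operations and popcounts. The patterns and sequence IDs below are invented toy values.

# Generic illustration of bitmap-based support counting (a simplification,
# not the f-NSP algorithm itself): bit i of a pattern's bitmap is set iff
# sequence i of the database contains the pattern.

def bitmap(seq_ids, n_sequences):
    b = 0
    for i in seq_ids:
        b |= 1 << i
    return b

n = 8                                  # toy database of 8 sequences
occ_a = bitmap([0, 1, 2, 5, 7], n)     # sequences containing pattern <a>
occ_ab = bitmap([1, 5], n)             # sequences containing pattern <a b>

sup_a = bin(occ_a).count("1")          # support via popcount
sup_ab = bin(occ_ab).count("1")
# Sequences containing <a> but never <a b>: a simple "negative" count
# obtained purely by bitwise operations, with no database re-scan.
sup_a_not_ab = bin(occ_a & ~occ_ab & ((1 << n) - 1)).count("1")
print(sup_a, sup_ab, sup_a_not_ab)     # 5 2 3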
Durán Santomil, P, Otero González, L, Martorell Cunill, O & Merigó Lindahl, JM 2018, 'Backtesting an equity risk model under Solvency II', Journal of Business Research, vol. 89, pp. 216-222.
© 2018 Elsevier Inc. Backtesting is a technique for validating internal models under Solvency II, which allows for evaluating the discrepancies between the results provided by a model and real observations. This paper aims to establish various backtesting tests and to show their applications to equity risk in Solvency II. Normal and empirical models with a rolling window are used to determine VaR at the 99.5% confidence level over a one-year time horizon. The proposed methodology performs the backtesting of annualized returns arising from the accumulation of daily returns. The results show that even if a model is conservative when tested out of sample, it may be inadequate when evaluated in sample, thereby highlighting the problems inherent in the out-of-sample backtesting proposed by the regulator.
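A minimal sketch of this style of out-of-sample backtesting, assuming an empirical VaR model at the 99.5% level, annual returns accumulated from placeholder daily returns, and a simple exceedance count (the paper's actual tests, models and data differ):

# Toy backtest of a 99.5% one-year VaR estimated from accumulated daily
# returns; all inputs are simulated placeholders.
import numpy as np

def empirical_var(returns, level=0.995):
    # VaR as the empirical quantile of the return distribution (reported as a loss).
    return -np.quantile(returns, 1.0 - level)

rng = np.random.default_rng(1)
daily = rng.normal(0.0003, 0.01, size=5000)            # placeholder daily returns
annual = np.array([np.prod(1 + daily[i:i + 250]) - 1   # accumulate ~250 trading days
                   for i in range(len(daily) - 250)])  # overlapping windows, toy only

split = len(annual) // 2
var_995 = empirical_var(annual[:split])                # estimate on the first half
exceedances = np.sum(annual[split:] < -var_995)        # out-of-sample violations
expected = 0.005 * (len(annual) - split)
print(f"VaR(99.5%) = {var_995:.3f}, exceedances = {exceedances}, expected = {expected:.1f}")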
Engemann, KJ, Merigó, JM, Terceño, A & Yager, RR 2018, 'Foreword', International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 26, no. Suppl. 1, pp. v-vii.
Esmaili, N, Piccardi, M, Kruger, B & Girosi, F 2018, 'Analysis of healthcare service utilization after transport-related injuries by a mixture of hidden Markov models', PLOS ONE, vol. 13, no. 11, pp. e0206274-e0206274.
© 2018 Esmaili et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Background: Transport injuries commonly result in significant disease burden, leading to physical disability, mental health deterioration and reduced quality of life. Analyzing the patterns of healthcare service utilization after transport injuries can provide an insight into the health of the affected parties, allow improved health system resource planning, and provide a baseline against which any future system-level interventions can be evaluated. Therefore, this research aims to use time series of service utilization provided by a compensation agency to identify groups of claimants with similar utilization patterns, describe such patterns, and characterize the groups in terms of demographic, accident type and injury type. Methods: To achieve this aim, we have proposed an analytical framework that utilizes latent variables to describe the utilization patterns over time and group the claimants into clusters based on their service utilization time series. To perform the clustering without dismissing the temporal dimension of the time series, we have used a well-established statistical approach known as the mixture of hidden Markov models (MHMM). Ensuing the clustering, we have applied multinomial logistic regression to provide a description of the clusters against demographic, injury and accident covariates. Results: We have tested our model with data on psychology service utilization from one of the main compensation agencies for transport accidents in Australia, and found that three clear clusters of service utilization can be evinced from the data. These three clusters correspond to claimants who have tended to use the services 1) only briefly after the accident; 2) for an intermediate period of time...
Feng, X, Wan, W, Xu, RYD, Chen, H, Li, P & Sánchez, JA 2018, 'A perceptual quality metric for 3D triangle meshes based on spatial pooling', Frontiers of Computer Science, vol. 12, no. 4, pp. 798-812.
Feng, X, Wan, W, Xu, RYD, Perry, S, Zhu, S & Liu, Z 2018, 'A new mesh visual quality metric using saliency weighting-based pooling strategy', Graphical Models, vol. 99, pp. 1-12.
© 2018 Elsevier Inc. Several metrics have been proposed to assess the visual quality of 3D triangular meshes during the last decade. In this paper, we propose a mesh visual quality metric by integrating mesh saliency into mesh visual quality assessment. We use the Tensor-based Perceptual Distance Measure metric to estimate the local distortions for the mesh, and pool local distortions into a quality score using a saliency weighting-based pooling strategy. Three well-known mesh saliency detection methods are used to demonstrate the superiority and effectiveness of our metric. Experimental results show that our metric with any of three saliency maps performs better than state-of-the-art metrics on the LIRIS/EPFL general-purpose database. We generate a synthetic saliency map by assembling salient regions from individual saliency maps. Experimental results reveal that the synthetic saliency map achieves better performance than individual saliency maps, and the performance gain is closely correlated with the similarity between the individual saliency maps.
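The saliency weighting-based pooling step amounts to a weighted average of local distortion values, with per-vertex saliency acting as the weights; the arrays below are illustrative stand-ins for the TPDM local distortions and mesh saliency maps used in the paper.

# Sketch of saliency-weighted pooling: local distortions are pooled into a
# single quality score using per-vertex saliency as weights. Values are toy data.
import numpy as np

def saliency_weighted_pooling(local_distortion, saliency, eps=1e-12):
    w = np.asarray(saliency, dtype=float)
    d = np.asarray(local_distortion, dtype=float)
    return float(np.sum(w * d) / (np.sum(w) + eps))

local_distortion = np.array([0.10, 0.40, 0.05, 0.25])  # per-vertex distortion
saliency = np.array([0.20, 0.90, 0.10, 0.60])          # per-vertex saliency
print(saliency_weighted_pooling(local_distortion, saliency))  # ~0.297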
Feng, X, Wan, W, Yi Da Xu, R, Perry, S, Li, P & Zhu, S 2018, 'A novel spatial pooling method for 3D mesh quality assessment based on percentile weighting strategy', Computers & Graphics, vol. 74, pp. 12-22.
Gaviria-Marin, M, Merigo, JM & Popa, S 2018, 'Twenty years of the Journal of Knowledge Management: a bibliometric analysis', Journal of Knowledge Management, vol. 22, no. 8, pp. 1655-1687.
Purpose: In 2017, the Journal of Knowledge Management (JKM) celebrates its 20th anniversary. This study aims to show an updated analysis of their publications to provide a general overview of the journal, focusing on a bibliometric analysis of its publications between 1997 and 2016. Design/methodology/approach: The methodology involves two procedures: a performance analysis and a science mapping analysis of JKM. The performance analysis uses a series of bibliometric indicators such as h-index, productivity and citations. This analysis considers different dimensions, including papers, authors, universities and countries. VOSviewer software is used to carry out the mapping of science of JKM, which, based on the concurrence of key words and co-citation points of view, seeks to graphically analyze the structure of the references of this journal. Findings: There is a positive evolution in the number of publications (although with certain oscillations), which shows a growing interest in publishing in JKM. The USA and the UK lead the publications in this journal, although at a regional level, Europe is the most productive. The low participation of emerging economies in JKM is also observed. Practical implications: The paper will identify the leading trends in the journal in terms of papers, authors, institutions, countries, journals and keywords. This study is useful for obtaining a quick snapshot of what is happening in the journal. Originality/value: From the h...
Gheisari, S, Catchpoole, D, Charlton, A, Melegh, Z, Gradhand, E & Kennedy, P 2018, 'Computer Aided Classification of Neuroblastoma Histological Images Using Scale Invariant Feature Transform with Feature Encoding', Diagnostics, vol. 8, no. 3, pp. 56-56.
Neuroblastoma is the most common extracranial solid malignancy in early childhood. Optimal management of neuroblastoma depends on many factors, including histopathological classification. Although histopathology study is considered the gold standard for classification of neuroblastoma histological images, computers can help to extract many more features, some of which may not be recognizable by human eyes. This paper proposes a combination of the Scale Invariant Feature Transform with a feature encoding algorithm to extract highly discriminative features. Then, distinctive image features are classified by a Support Vector Machine classifier into five clinically relevant classes. The advantage of our model is extracting features which are more robust to scale variation compared to the Patched Completed Local Binary Pattern and Completed Local Binary Pattern methods. We gathered a database of 1043 histologic images of neuroblastic tumours classified into five subtypes. Our approach identified features that outperformed the state-of-the-art on both our neuroblastoma dataset and a benchmark breast cancer dataset. Our method shows promise for classification of neuroblastoma histological images.
Gheisari, S, Catchpoole, DR, Charlton, A & Kennedy, PJ 2018, 'Convolutional Deep Belief Network with Feature Encoding for Classification of Neuroblastoma Histological Images', Journal of Pathology Informatics, vol. 9, no. 1, pp. 17-17.
© 2018 Journal of Pathology Informatics. Background: Neuroblastoma is the most common extracranial solid tumor in children younger than 5 years old. Optimal management of neuroblastic tumors depends on many factors including histopathological classification. The gold standard for classification of neuroblastoma histological images is visual microscopic assessment. In this study, we propose and evaluate a deep learning approach to classify high-resolution digital images of neuroblastoma histology into five different classes determined by the Shimada classification. Subjects and Methods: We apply a combination of convolutional deep belief network (CDBN) with feature encoding algorithm that automatically classifies digital images of neuroblastoma histology into five different classes. We design a three-layer CDBN to extract high-level features from neuroblastoma histological images and combine with a feature encoding model to extract features that are highly discriminative in the classification task. The extracted features are classified into five different classes using a support vector machine classifier. Data: We constructed a dataset of 1043 neuroblastoma histological images derived from Aperio scanner from 125 patients representing different classes of neuroblastoma tumors. Results: The weighted average F-measure of 86.01% was obtained from the selected high-level features, outperforming state-of-the-art methods. Conclusion: The proposed computer-aided classification system, which uses the combination of deep architecture and feature encoding to learn high-level features, is highly effective in the classification of neuroblastoma histological images.
Goodswen, SJ, Kennedy, PJ & Ellis, JT 2018, 'A Gene-Based Positive Selection Detection Approach to Identify Vaccine Candidates Using Toxoplasma gondii as a Test Case Protozoan Pathogen', Frontiers in Genetics, vol. 9, no. AUG.
© 2018 Goodswen, Kennedy and Ellis. Over the last two decades, various in silico approaches have been developed and refined that attempt to identify protein and/or peptide vaccine candidates from informative signals encoded in protein sequences of a target pathogen. To date, no signal has been identified that clearly indicates a protein will effectively contribute to a protective immune response in a host. The premise for this study is that proteins under positive selection from the immune system are more likely suitable vaccine candidates than proteins exposed to other selection pressures. Furthermore, our expectation is that protein sequence regions encoding major histocompatibility complex (MHC) binding peptides will contain consecutive positive selection sites. Using freely available data and bioinformatic tools, we present a high-throughput approach through a pipeline that predicts positive selection sites, protein subcellular locations, and sequence locations of medium to high T-cell MHC class I binding peptides. Positive selection sites are estimated from a sequence alignment by comparing rates of synonymous (dS) and non-synonymous (dN) substitutions among protein coding sequences of orthologous genes in a phylogeny. The main pipeline output is a list of protein vaccine candidates predicted to be naturally exposed to the immune system and containing sites under positive selection. Candidates are ranked with respect to the number of consecutive sites located on protein sequence regions encoding MHCI-binding peptides. Results are constrained by the reliability of prediction programs and quality of input data. Protein sequences from the Toxoplasma gondii ME49 strain (TGME49) were used as a case study. Surface antigen (SAG), dense granule (GRA), microneme (MIC), and rhoptry (ROP) proteins are considered worthy T. gondii candidates. Given 8263 TGME49 protein sequences processed anonymously, the top 10 predicted candidates were all worthy candidates...
Gu, Y, Gu, M, Long, Y, Xu, G, Yang, Z, Zhou, J & Qu, W 2018, 'An enhanced short text categorization model with deep abundant representation', World Wide Web, vol. 21, no. 6, pp. 1705-1719.
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. Short text categorization is a crucial issue for many applications, e.g., Information Retrieval, Question-Answering Systems, MRI Database Construction and so forth. Much research focuses on data sparsity and ambiguity issues in short text categorization. To tackle these issues, we propose a novel short text categorization strategy based on abundant representation, which utilizes a Bi-directional Recurrent Neural Network (Bi-RNN) with Long Short-Term Memory (LSTM) and a topic model to catch more contextual and semantic information. Bi-RNN enriches contextual information, and the topic model discovers more latent semantic information for abundant text representation of short text. Experimental results demonstrate that the proposed model is comparable to state-of-the-art neural network models and that the proposed method is effective.
Han, B, Tsang, IW, Chen, L, Yu, CP & Fung, S-F 2018, 'Progressive Stochastic Learning for Noisy Labels', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 10, pp. 5136-5148.
© 2018 IEEE. Large-scale learning problems require a plethora of labels that can be efficiently collected from crowdsourcing services at low cost. However, labels annotated by crowdsourced workers are often noisy, which inevitably degrades the performance of large-scale optimizations including the prevalent stochastic gradient descent (SGD). Specifically, these noisy labels adversely affect updates of the primal variable in conventional SGD. To solve this challenge, we propose a robust SGD mechanism called progressive stochastic learning (POSTAL), which naturally integrates the learning regime of curriculum learning (CL) with the update process of vanilla SGD. Our inspiration comes from the progressive learning process of CL, namely learning from 'easy' tasks to 'complex' tasks. Through the robust learning process of CL, POSTAL aims to yield robust updates of the primal variable on an ordered label sequence, namely, from 'reliable' labels to 'noisy' labels. To realize POSTAL mechanism, we design a cluster of 'screening losses,' which sorts all labels from the reliable region to the noisy region. To sum up, POSTAL using screening losses ensures robust updates of the primal variable on reliable labels first, then on noisy labels incrementally until convergence. In theory, we derive the convergence rate of POSTAL realized by screening losses. Meanwhile, we provide the robustness analysis of representative screening losses. Experimental results on UCI simulated and Amazon Mechanical Turk crowdsourcing data sets show that the POSTAL using screening losses is more effective and robust than several existing baselines. (UCI is the abbreviation of University of California Irvine.)
Hanh, LTM, Binh, NT & Tung, KT 2018, 'Parallel Mutant Execution Techniques in Mutation Testing Process for Simulink Models', Journal of Telecommunications and Information Technology, vol. 4, no. 2017, pp. 90-100.
Mutation testing – a fault-based technique for software testing – is a computationally expensive approach. One of the powerful methods to improve the performance of mutation without reducing effectiveness is to employ parallel processing, where mutants and tests are executed in parallel. This approach reduces the total time needed to accomplish the mutation analysis. This paper proposes three strategies for parallel execution of mutants on multicore machines using the Parallel Computing Toolbox (PCT) with the Matlab Distributed Computing Server. It aims to demonstrate that the computationally intensive software testing schemes, such as mutation, can be facilitated by using parallel processing. The experiments were carried out on eight different Simulink models. The results represented the efficiency of the proposed approaches in terms of execution time during the testing process.
Hao, S, Shi, C, Niu, Z & Cao, L 2018, 'Concept coupling learning for improving concept lattice-based document retrieval', Engineering Applications of Artificial Intelligence, vol. 69, pp. 65-75.
© 2017 Elsevier Ltd The semantic information in any document collection is critical for query understanding in information retrieval. Existing concept lattice-based retrieval systems mainly rely on the partial order relation of formal concepts to index documents. However, the methods used by these systems often ignore the explicit semantic information between the formal concepts extracted from the collection. In this paper, a concept coupling relationship analysis model is proposed to learn and aggregate the intra- and inter-concept coupling relationships. The intra-concept coupling relationship employs the common terms of formal concepts to describe the explicit semantics of formal concepts. The inter-concept coupling relationship adopts the partial order relation of formal concepts to capture the implicit dependency of formal concepts. Based on the concept coupling relationship analysis model, we propose a concept lattice-based retrieval framework. This framework represents user queries and documents in a concept space based on fuzzy formal concept analysis, utilizes a concept lattice as a semantic index to organize documents, and ranks documents with respect to the learned concept coupling relationships. Experiments are performed on the text collections acquired from the SMART information retrieval system. Compared with classic concept lattice-based retrieval methods, our proposed method achieves at least 9%, 8% and 15% improvement in terms of average MAP, IAP@11 and P@10 respectively on all the collections.
Ho, N, Peng, H, Mayoh, C, Liu, PY, Atmadibrata, B, Marshall, GM, Li, J & Liu, T 2018, 'Delineation of the frequency and boundary of chromosomal copy number variations in paediatric neuroblastoma', Cell Cycle, vol. 17, no. 6, pp. 749-758.
Hu, L, Chen, Q, Zhao, H, Jian, S, Cao, L & Cao, J 2018, 'Neural Cross-Session Filtering: Next-Item Prediction Under Intra- and Inter-Session Context', IEEE Intelligent Systems, vol. 33, no. 6, pp. 57-67.
© 2018 IEEE. Classic recommender systems (RSs) often repeatedly recommend items similar to users' historical profiles or recent purchases. For this reason, session-based RSs (SBRSs) have been extensively studied in recent years. Current SBRSs often assume a rigid-order sequence, which does not fit many real-world cases. In fact, the next-item recommendation depends not only on the current session context but also on historical sessions, which are often neglected by current SBRSs. Accordingly, an SBRS over relaxed-order sequences with both intra- and inter-session context is more pragmatic. Inspired by the successful experience in modern language modeling, we design an efficient neural architecture to model both intra- and inter-session context for next-item prediction.
Huang, X, Zhang, J, Wu, Q, Fan, L & Yuan, C 2018, 'A Coarse-to-Fine Algorithm for Matching and Registration in 3D Cross-Source Point Clouds', IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 10, pp. 2965-2977.
© 1991-2012 IEEE. We propose an efficient method to deal with the matching and registration problem found in cross-source point clouds captured by different types of sensors. This task is especially challenging due to the presence of density variation, scale difference, a large proportion of noise and outliers, missing data, and viewpoint variation. The proposed method has two stages: in the coarse matching stage, we use the ensemble of shape functions descriptor to select potential K regions from the candidate point clouds for the target. In the fine stage, we propose a scale embedded generative Gaussian mixture models registration method to refine the results from the coarse matching stage. Following the fine stage, both the best region and accurate camera pose relationships between the candidates and target are found. We conduct experiments in which we apply the method to two applications: one is 3D object detection and localization in street-view outdoor (LiDAR/VSFM) cross-source point clouds and the other is 3D scene matching and registration in indoor (KinectFusion/VSFM) cross-source point clouds. The experiment results show that the proposed method performs well when compared with the existing methods. It also shows that the proposed method is robust under various sensing techniques, such as LiDAR, Kinect, and RGB camera.
Huang, Y, Cao, L, Zhang, J, Pan, L & Liu, Y 2018, 'Exploring Feature Coupling and Model Coupling for Image Source Identification', IEEE Transactions on Information Forensics and Security, vol. 13, no. 12, pp. 3108-3121.
© 2005-2012 IEEE. Recently, there has been great interest in feature-based image source identification. Previous statistical learning-based methods usually regarded the identification process as a classification problem. They assumed the dependence of features and the dependence of models. However, the two assumptions are usually problematic because of the genuine coupling of features and models. To address the issues, in this paper, we propose a novel image source identification scheme. For the feature coupling, a coupled feature representation is adopted to analyze the coupled interaction among features. The coupling relations among features and their powers are measured with Pearson's correlations and integrated in a Taylor-like expansion manner. Regarding model coupling, a new coupled probability representation is developed. The model coupling relationships are characterized with conditional probabilities induced by the confusion matrix and then combined with the law of total probability. The experiments carried out on the Dresden image collection confirm the effectiveness of the proposed scheme. Via mining the feature coupling and model coupling, the identification accuracy can be significantly improved.
Huque, MH, Anderson, C, Walton, R, Woolford, S & Ryan, L 2018, 'Smooth individual level covariates adjustment in disease mapping', Biometrical Journal, vol. 60, no. 3, pp. 597-615.
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available “indiCAR” model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log‐linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non‐log‐linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth‐indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two‐step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth‐indiCAR through simulation. Our results indicate that the smooth‐indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia.
Hussain, W, Hussain, FK, Saberi, M, Hussain, OK & Chang, E 2018, 'Comparing time series with machine learning-based prediction approaches for violation management in cloud SLAs', Future Generation Computer Systems, vol. 89, pp. 464-477.
© 2018 In cloud computing, service level agreements (SLAs) are legal agreements between a service provider and consumer that contain a list of obligations and commitments which need to be satisfied by both parties during the transaction. From a service provider's perspective, a violation of such a commitment leads to penalties in terms of money and reputation and thus has to be effectively managed. In the literature, this problem has been studied under the domain of cloud service management. One aspect required to manage cloud services after the formation of SLAs is to predict the future Quality of Service (QoS) of cloud parameters to ascertain if they lead to violations. Various approaches in the literature perform this task using different prediction methods; however, none of them study the accuracy of each. However, it is important to do this, as the results of each prediction approach vary according to the pattern of the input data, and an incorrect choice of prediction algorithm could lead to service violations and penalties. In this paper, we test and report the accuracy of time series and machine learning-based prediction approaches. In each category, we test many different techniques and rank them according to their order of accuracy in predicting future QoS. Our analysis helps the cloud service provider to choose an appropriate prediction approach (whether time series or machine learning based) and further to utilize the best method depending on input data patterns to obtain an accurate prediction result and better manage their SLAs to avoid violation penalties.
Ji, K, Chen, Z, Sun, R, Ma, K, Yuan, Z & Xu, G 2018, 'GIST: A generative model with individual and subgroup-based topics for group recommendation', Expert Systems with Applications, vol. 94, pp. 81-93.
View/Download from: Publisher's site
View description>>
© 2017 Elsevier Ltd In this paper, a topic-based probabilistic model named GIST is proposed to infer group activities and make group recommendations. Compared with existing individual-based aggregation methods, it considers not only individual members' interests but also the interests of some subgroups. Intuitively, when a group of users wants to take part in an activity, not every group member is decisive; instead, it is more likely that the subgroups of members having close relationships lead to the final activity decision. That motivates our study on jointly considering individual members' choices and subgroups' choices for group recommendations. Based on this, our model uses two kinds of unshared topics to model individual members' interests and subgroups' interests separately, and then makes final recommendations according to the choices from the two aspects with a weight-based scheme. Moreover, the link information in the graph topology of the groups can be used to optimize the weights of our model. The experimental results on real-life data show that the recommendation accuracy of GIST is significantly improved compared with the state-of-the-art methods.
Jian, S, Cao, L, Lu, K & Gao, H 2018, 'Unsupervised Coupled Metric Similarity for Non-IID Categorical Data', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 9, pp. 1810-1823.
View/Download from: Publisher's site
View description>>
© 1989-2012 IEEE. Appropriate similarity measures always play a critical role in data analytics, learning, and processing. Measuring the intrinsic similarity of categorical data for unsupervised learning has not been substantially addressed, and even less effort has been made for the similarity analysis of categorical data that is not independent and identically distributed (non-IID). In this work, a Coupled Metric Similarity (CMS) is defined for unsupervised learning which flexibly captures the value-to-attribute-to-object heterogeneous coupling relationships. CMS learns the similarities in terms of intrinsic heterogeneous intra- and inter-attribute couplings and attribute-to-object couplings in categorical data. The CMS validity is guaranteed by satisfying metric properties and conditions, and CMS can flexibly adapt to both IID and non-IID data. CMS is incorporated into spectral clustering and k-modes clustering and compared with relevant state-of-the-art similarity measures that are not necessarily metrics. The experimental results and theoretical analysis show the effectiveness of CMS in capturing independent and coupled data characteristics; it significantly outperforms other similarity measures on most datasets.
Jing, D, Huang, Y, Liu, X, Sia, KCS, Zhang, JC, Tai, X, Wang, M, Toscan, CE, McCalmont, H, Evans, K, Mayoh, C, Poulos, RC, Span, M, Mi, J, Zhang, C, Wong, JWH, Beck, D, Pimanda, JE & Lock, RB 2018, 'Lymphocyte-Specific Chromatin Accessibility Pre-determines Glucocorticoid Resistance in Acute Lymphoblastic Leukemia', Cancer Cell, vol. 34, no. 6, pp. 906-921.e8.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier Inc. Glucocorticoids play a critical role in the treatment of lymphoid malignancies. While glucocorticoid efficacy can be largely attributed to lymphocyte-specific apoptosis, its molecular basis remains elusive. Here, we studied genome-wide lymphocyte-specific open chromatin domains (LSOs), and integrated LSOs with glucocorticoid-induced RNA transcription and chromatin modulation using an in vivo patient-derived xenograft model of acute lymphoblastic leukemia (ALL). This led to the identification of LSOs critical for glucocorticoid-induced apoptosis. Glucocorticoid receptor cooperated with CTCF at these LSOs to mediate DNA looping, which was inhibited by increased DNA methylation in glucocorticoid-resistant ALL and non-lymphoid cell types. Our study demonstrates that lymphocyte-specific epigenetic modifications pre-determine glucocorticoid resistance in ALL and may account for the lack of glucocorticoid sensitivity in other cell types. Jing et al. identified lymphocyte-specific open chromatin domains (LSOs) critical for glucocorticoid (GC)-induced acute lymphoblastic leukemia (ALL) apoptosis. GC receptor cooperated with CTCF at these LSOs to mediate DNA looping, which was inhibited by DNA methylation in GC-resistant ALL and non-lymphoid cell types.
Kendrick, L, Musial, K & Gabrys, B 2018, 'Change point detection in social networks—Critical review with experiments', Computer Science Review, vol. 29, pp. 1-13.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier Inc. Change point detection in social networks is an important element in developing the understanding of dynamic systems. This complex and growing area of research has no clear guidelines on what methods to use or in which circumstances. This paper critically discusses several possible network metrics to be used for a change point detection problem and conducts an experimental, comparative analysis using the Enron and MIT networks. Bayesian change point detection analysis is conducted on different global graph metrics (Size, Density, Average Clustering Coefficient, Average Shortest Path) as well as metrics derived from the Hierarchical and Block models (Entropy, Edge Probability, No. of Communities, Hierarchy Level Membership). The results produced the posterior probability of a change point at weekly time intervals, which was analysed against ground truth change points using precision and recall measures. Results suggest that computationally heavy generative models offer only slightly better results compared to some of the global graph metrics. The simplest metrics used in the experiments, i.e. the numbers of nodes and links, are the recommended choice for detecting overall structural changes.
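As a toy illustration of change point detection on a global graph metric (weekly density computed with networkx, flagged with a simple rolling z-score rather than the Bayesian procedure used in the paper; the snapshots are synthetic, not the Enron or MIT data):

import networkx as nx
import numpy as np

def weekly_density(snapshots):
    # one global graph metric per weekly snapshot (here: density)
    return np.array([nx.density(g) for g in snapshots])

def simple_change_points(metric, window=8, z_thresh=3.0):
    # flag week t if its value deviates strongly from the preceding window
    points = []
    for t in range(window, len(metric)):
        ref = metric[t - window:t]
        sd = ref.std() or 1e-9
        if abs(metric[t] - ref.mean()) / sd > z_thresh:
            points.append(t)
    return points

# toy example: 30 random weekly graphs, with denser communication after week 20
rng = np.random.default_rng(2)
snapshots = [nx.gnp_random_graph(50, 0.05 if t < 20 else 0.15, seed=int(rng.integers(1e6)))
             for t in range(30)]
print(simple_change_points(weekly_density(snapshots)))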
Khuat, TT & Le, MH 2018, 'A Novel Hybrid ABC-PSO Algorithm for Effort Estimation of Software Projects Using Agile Methodologies', Journal of Intelligent Systems, vol. 27, no. 3, pp. 489-506.
View/Download from: Publisher's site
View description>>
Abstract: In modern software development processes, software effort estimation plays a crucial role. The success or failure of projects depends greatly on the accuracy of effort estimation and schedule results. Many studies have focused on proposing novel models to enhance the accuracy of predicted results; however, accurate effort estimation remains a challenging issue for researchers and practitioners, especially when it comes to projects using agile methodologies. This study introduces a novel formula based on team velocity and story point factors. The parameters of this formula are then optimized by employing swarm optimization algorithms. We also propose an improved algorithm combining the advantages of the artificial bee colony and particle swarm optimization algorithms. The experimental results indicated that our approaches outperformed methods in other studies in terms of the accuracy of predicted results.
Kieu, LM, Ou, Y & Cai, C 2018, 'Large-scale transit market segmentation with spatial-behavioural features', Transportation Research Part C: Emerging Technologies, vol. 90, pp. 97-113.
View/Download from: Publisher's site
View description>>
Transit market segmentation enables transit providers to comprehend the commonalities and heterogeneities among different groups of passengers, so that they can cater for individual transit riders' mobility needs. The problem has recently been attracting great interest with the proliferation of automated data collection systems such as Smart Card Automated Fare Collection (AFC), which allow researchers to observe individual travel behaviours over a long time period. However, there is a need for an integrated market segmentation method that incorporates both spatial and behavioural features of individual transit passengers. This algorithm also needs to be efficient for large-scale implementation. This paper proposes a new algorithm named Spatial Affinity Propagation (SAP), based on the classical Affinity Propagation algorithm (AP), to enable large-scale transit market segmentation with spatial-behavioural features. SAP segments transit passengers using spatial geodetic coordinates, where passengers from the same segment are located within immediate walking distance, and using behavioural features mined from AFC data. The comparison with AP and popular algorithms in the literature shows that SAP provides nearly as good clustering performance as AP while being 52% more efficient in computation time. This efficient framework would enable transit operators to leverage the availability of AFC data to understand the commonalities and heterogeneities among different groups of passengers.
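A rough sketch of the core idea as described, assuming a precomputed similarity matrix that heavily penalises pairs beyond walking distance and otherwise compares behavioural features, passed to scikit-learn's standard Affinity Propagation; the coordinates, features and the 500 m walking limit are made-up placeholders, and this is not the SAP algorithm itself:

import numpy as np
from sklearn.cluster import AffinityPropagation

def spatial_behavioural_similarity(coords_m, behaviour, walk_limit_m=500.0):
    # negative behavioural distance, with a large penalty when two passengers'
    # locations are farther apart than walking distance
    n = len(coords_m)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            spatial = np.linalg.norm(coords_m[i] - coords_m[j])
            behav = np.linalg.norm(behaviour[i] - behaviour[j])
            S[i, j] = -behav if spatial <= walk_limit_m else -1e4
    return S

rng = np.random.default_rng(3)
centres = np.array([[0.0, 0.0], [3000.0, 0.0], [0.0, 3000.0]])      # planar metres, toy stand-in for geodetic coordinates
coords = np.vstack([c + rng.normal(scale=150, size=(20, 2)) for c in centres])
behaviour = rng.normal(size=(60, 4))    # e.g. trip frequency, boarding times, regularity (synthetic)
S = spatial_behavioural_similarity(coords, behaviour)
labels = AffinityPropagation(affinity="precomputed", damping=0.9, random_state=0).fit_predict(S)
print(len(set(labels)), "segments")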
Kusakunniran, W, Wu, Q, Ritthipravat, P & Zhang, J 2018, 'Hard exudates segmentation based on learned initial seeds and iterative graph cut', Computer Methods and Programs in Biomedicine, vol. 158, pp. 173-183.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier B.V. (Background and Objective): The occurrence of hard exudates is one of the early signs of diabetic retinopathy, which is one of the leading causes of blindness. Many patients with diabetic retinopathy lose their vision because of the late detection of the disease. Thus, this paper proposes a novel method for segmenting hard exudates in retinal images in an automatic way. (Methods): The existing methods are based on either supervised or unsupervised learning techniques. In addition, the learned segmentation models may often cause missed detection and/or false detection of hard exudates, due to the lack of rich characteristics, the intra-variations, and the similarity with other components in the retinal image. Thus, in this paper, supervised learning based on the multilayer perceptron (MLP) is used only to identify initial seeds with high confidence of being hard exudates. Then, the segmentation is finalized by unsupervised learning based on the iterative graph cut (GC) using clusters of initial seeds. Also, in order to reduce color intra-variations of hard exudates in different retinal images, color transfer (CT) is applied to normalize their color information in the pre-processing step. (Results): The experiments and comparisons with the other existing methods are based on two well-known datasets, e_ophtha EX and DIARETDB1. The proposed method outperforms the other existing methods in the literature, with a pixel-level sensitivity of 0.891 for the DIARETDB1 dataset and 0.564 for the e_ophtha EX dataset. The cross-dataset validation, where the training process is performed on one dataset and the testing process is performed on another dataset, is also evaluated in this paper in order to illustrate the robustness of the proposed method. (Conclusions): This newly proposed method integrates supervised learning and unsupervised learning based techniques. It achieves the improved performa...
Laengle, S, Modak, NM, Merigó, JM & De La Sotta, C 2018, 'Thirty years of the International Journal of Computer Integrated Manufacturing: a bibliometric analysis', International Journal of Computer Integrated Manufacturing, vol. 31, no. 12, pp. 1247-1268.
View/Download from: Publisher's site
View description>>
© 2018, © 2018 Informa UK Limited, trading as Taylor & Francis Group. The International Journal of Computer Integrated Manufacturing was established in 1988 with the idea of advancing research in computer integrated manufacturing (CIM) technologies and promoting the application of those technologies within industry. The journal was created to facilitate the exchange of new knowledge between industry and academia derived from both research and practical application. To celebrate the 30-year journey of the journal, this study develops a bibliometric analysis of all the publications of the journal to 2017. Information was collected using the Web of Science Core Collection database. The present study has been conducted to highlight the significant contributions of the journal in terms of impact, topics, authors, universities and countries. Finally, visualisation of similarities (VOS) viewer software was used to present graphical representations of the bibliographic coupling, co-citation, citation, co-authorship and co-occurrence of keywords.
Laengle, S, Modak, NM, Merigo, JM & Zurita, G 2018, 'Twenty-Five Years of Group Decision and Negotiation: A Bibliometric Overview', Group Decision and Negotiation, vol. 27, no. 4, pp. 505-542.
View/Download from: Publisher's site
View description>>
© 2018, Springer Science+Business Media B.V., part of Springer Nature. Twenty-five years ago, in 1992, a journal named Group Decision and Negotiation was established in association with the Institute for Operations Research and the Management Sciences with the vision of promoting theoretical and empirical research, real-world applications and case studies on group decision and negotiation processes. To celebrate its 25 years of continuous and outstanding contributions, this study aims to develop a bibliometric analysis of the publications of the journal between 1992 and 2016. The Web of Science Core Collection database is used to identify the leading trends of the journal in terms of impacts, topics, authors, universities and countries. Moreover, it utilizes the visualization of similarities viewer software to analyze the bibliographic couplings, co-citations, citations, co-authorships and co-occurrences of keywords.
Lan, C, Peng, H, McGowan, EM, Hutvagner, G & Li, J 2018, 'An isomiR expression panel based novel breast cancer classification approach using improved mutual information', BMC Medical Genomics, vol. 11, no. S6, pp. 118-118.
View/Download from: Publisher's site
View description>>
BACKGROUND:Gene expression-based profiling has been used to identify biomarkers for different breast cancer subtypes. However, this technique has many limitations. IsomiRs are isoforms of miRNAs that have critical roles in many biological processes and have been successfully used to distinguish various cancer types. Biomarker isomiRs for identifying different breast cancer subtypes have not been investigated. For the first time, we aim to show that isomiRs are better performing biomarkers and use them to explain molecular differences between breast cancer subtypes. RESULTS:In this study, a novel method is proposed to identify specific isomiRs that faithfully classify breast cancer subtypes. First, as a null hypothesis method, we removed the lowly expressed isomiRs from small RNA sequencing data generated from diverse breast cancer types. Second, we developed an improved mutual information-based feature selection method to calculate the weight of each isomiR expression. The weight of an isomiR measures its importance in classifying breast cancer subtypes. The improved mutual information makes it possible to handle datasets in which the features are continuous and the labels are discrete, a setting in which the traditional mutual information cannot be applied. Finally, the support vector machine (SVM) classifier is applied to find isomiR biomarkers for subtyping. CONCLUSIONS:Here we demonstrate that isomiRs can be used as biomarkers in the identification of different breast cancer subtypes, and in addition, they may provide new insights into the diverse molecular mechanisms of breast cancers. We have also shown that the classification of different subtypes of breast cancer based on isomiR expression is more effective than using published gene expression profiling. The proposed method provides a better performance outcome than the Fisher method and the Hellinger method for discovering biomarkers to distinguish different breast cancer subtypes. This novel techniqu...
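A hedged sketch of the overall pipeline described (filter lowly expressed features, rank the rest by mutual information, classify with an SVM), using scikit-learn's mutual_info_classif, which accepts continuous features with discrete labels, as a stand-in for the authors' improved mutual information; the expression matrix and subtype labels are synthetic:

import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.lognormal(mean=1.0, sigma=1.0, size=(120, 300))   # toy isomiR expression matrix
y = rng.integers(0, 4, size=120)                          # four hypothetical subtypes
X[y == 0, :10] *= 3.0                                     # make a few features informative

# filter lowly expressed features, then rank the rest by mutual information
keep = X.mean(axis=0) > np.quantile(X.mean(axis=0), 0.25)
Xf = X[:, keep]
mi = mutual_info_classif(Xf, y, random_state=0)
top = np.argsort(mi)[::-1][:20]                           # top-ranked candidate biomarkers

scores = cross_val_score(SVC(kernel="linear"), Xf[:, top], y, cv=5)
print("CV accuracy:", scores.mean())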
León-Castro, E, Avilés-Ochoa, E & Merigó, JM 2018, 'Induced Heavy Moving Averages', International Journal of Intelligent Systems, vol. 33, no. 9, pp. 1823-1839.
View/Download from: Publisher's site
León-Castro, E, Avilés-Ochoa, E, Merigó, JM & Gil-Lafuente, AM 2018, 'Heavy Moving Averages and Their Application in Econometric Forecasting', Cybernetics and Systems, vol. 49, no. 1, pp. 26-43.
View/Download from: Publisher's site
View description>>
© 2017 Taylor & Francis Group, LLC. This paper presents the heavy ordered weighted moving average (HOWMA) operator. It is an aggregation operator that uses the main characteristics of two well-known techniques: the heavy ordered weighted averaging (OWA) and the moving averages. Therefore, this operator provides a parameterized family of aggregation operators from the minimum to the total operator and includes the OWA operator as a special case. It uses a heavy weighting vector in the moving average formulation and it represents the information available and the knowledge of the decision maker about the future scenarios of the phenomenon, according to his attitudinal character. Some of the main properties of this operator are studied, including a wide range of families of HOWMA operators such as the heavy moving average and heavy weighted moving average operators. The HOWMA operator is also extended using generalized and quasi-arithmetic means. An example concerning the foreign exchange rate between US dollars and Mexican pesos is also presented.
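For orientation, a sketch of the heavy OWA structure on which the HOWMA operator builds, paraphrased from the standard heavy OWA definition rather than quoted from the paper: over the last m observations,

\mathrm{HOWMA}(a_{t-m+1},\dots,a_{t}) = \sum_{j=1}^{m} w_j\, b_j, \qquad b_j = j\text{-th largest of } a_{t-m+1},\dots,a_{t}, \qquad w_j \in [0,1], \quad 1 \le \sum_{j=1}^{m} w_j \le m .

When the weights sum to 1 this reduces to an OWA-based moving average, and when every weight equals 1 it gives the total operator mentioned in the abstract.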
Li, L, Liu, J, Sun, Y, Xu, G, Yuan, J & Zhong, L 2018, 'Unsupervised keyword extraction from microblog posts via hashtags', Journal of Web Engineering, vol. 17, no. 1-2, pp. 93-120.
View description>>
Nowadays, huge amounts of text are being generated for social networking purposes on the Web. Keyword extraction from such texts, like microblog posts, benefits many applications such as advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually has some special social feature like a hashtag that is topical in nature and generated by users. Extracting keywords related to hashtags can reflect the intents of users and thus provides us with a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts by treating hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. One is a topic model algorithm that infers topic distributions biased towards hashtags on a collection of microblog posts. The words are ranked by their average topic probabilities. Our topic model algorithm can not only find the topics of a collection, but also extract hashtag-related keywords. The other is a random walk based algorithm. It first builds a word-post weighted graph by taking posts themselves into account. Then, a hashtag-biased random walk is applied on this graph, which guides the algorithm to extract keywords according to hashtag topics. Last, the final ranking score of a word is determined by the stationary probability after a number of iterations. We evaluate our proposed approach on a collection of real Chinese microblog posts. Experiments show that our approach is more effective in terms of precision than traditional approaches that consider no hashtags. The result achieved by the combination of the two algorithms performs even better than each individual algorithm.
Li, L, Liu, Z & Zhang, J 2018, 'Unsupervised image co-segmentation via guidance of simple images', Neurocomputing, vol. 275, pp. 1650-1661.
View/Download from: Publisher's site
View description>>
© 2017 Elsevier B.V. This paper proposes a novel image co-segmentation method, which aims to segment the common objects in a group of images. The proposed method takes advantage of the reliability of simple images and successfully improves the performance. The images are first ranked by their complexities based on their saliency maps. Then, the simple images, in which objects are common and easy to segment, are selected and processed to obtain their segmentation results; these segmentation results are taken as samples of the targeted objects. Finally, the remaining complicated images are segmented with the guidance of the samples. The experiments on the iCoseg dataset demonstrate the superior performance and robustness of the proposed method.
Li, Z, Nie, F, Chang, X, Yang, Y, Zhang, C & Sebe, N 2018, 'Dynamic Affinity Graph Construction for Spectral Clustering Using Multiple Features', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 12, pp. 6323-6332.
View/Download from: Publisher's site
View description>>
© 2012 IEEE. Spectral clustering (SC) has been widely applied to various computer vision tasks, where the key is to construct a robust affinity matrix for data partitioning. With the increase in visual features, conventional SC methods are facing two challenges: 1) how to effectively generate an affinity matrix based on multiple features? and 2) how to deal with high-dimensional visual features which could be redundant? To address these issues, we present a new approach to: 1) learn a robust affinity matrix using multiple features, allowing us to simultaneously determine optimal weights for each feature; and 2) decide a set of optimal projection matrices, one for each feature, that determine the lower dimensional space, as well as the optimal affinity weight of each data pair in the lower dimensional space. There are two major advantages of our new approach over existing clustering techniques. First, our approach assigns affinity weights for data points on a per-data-pair basis. The learning procedure avoids the explicit specification of the size of the neighborhood in the affinity matrix, and the bandwidth parameter required to compute the Gaussian kernel, both of which are sensitive and yet difficult to determine beforehand. Second, the affinity weights are based on the distances in a lower dimensional space, while the low-dimensional space is inferred according to the optimized affinity weights. Both variables are jointly optimized so as to leverage mutual benefits. The experimental results outperform those of the compared alternatives, which indicates that the proposed method is effective in simultaneously learning the affinity graph and feature fusion, resulting in better clustering results.
Lian, D, Zheng, K, Ge, Y, Cao, L, Chen, E & Xie, X 2018, 'GeoMF++', ACM Transactions on Information Systems, vol. 36, no. 3, pp. 1-29.
View/Download from: Publisher's site
View description>>
Location recommendation is an important means to help people discover attractive locations. However, the extreme sparsity of user-location matrices leads to a severe challenge, so it is necessary to take the implicit feedback characteristics of user mobility data into account and leverage the location's spatial information. To this end, based on the previously developed GeoMF, we propose a scalable and flexible framework, dubbed GeoMF++, for joint geographical modeling and implicit feedback-based matrix factorization. We then develop an efficient optimization algorithm for parameter learning, which scales linearly with data size and the total number of neighbor grids of all locations. GeoMF++ can be well explained from two perspectives. First, it subsumes two-dimensional kernel density estimation, so it captures the spatial clustering phenomenon in user mobility data; second, it is strongly connected with widely used neighbor additive models, graph Laplacian regularized models, and collective matrix factorization. Finally, we extensively evaluate GeoMF++ on two large-scale LBSN datasets. The experimental results show that GeoMF++ consistently outperforms the state-of-the-art and other competing baselines on both datasets in terms of NDCG and Recall. Besides, the efficiency studies show that GeoMF++ is much more scalable with the increase of data size and the dimension of latent space.
Liao, H, Xu, Z, Herrera, F & Merigó, JM 2018, 'Editorial Message: Special Issue on Hesitant Fuzzy Linguistic Decision Making: Algorithms, Theory and Applications', International Journal of Fuzzy Systems, vol. 20, no. 7, pp. 2083-2083.
View/Download from: Publisher's site
Liu, P, Liu, J & Merigó, JM 2018, 'Partitioned Heronian means based on linguistic intuitionistic fuzzy numbers for dealing with multi-attribute group decision making', Applied Soft Computing, vol. 62, pp. 395-422.
View/Download from: Publisher's site
View description>>
© 2017 Elsevier B.V. The Heronian mean (HM) operator has the advantage of considering the interrelationships between parameters, and the linguistic intuitionistic fuzzy number (LIFN), in which the membership and non-membership are expressed by linguistic terms, can more easily describe the uncertain and vague information existing in the real world. In this paper, we propose the partitioned Heronian mean (PHM) operator, which assumes that all attributes are partitioned into several parts and members in the same part are interrelated while there are no interrelationships among members of different parts, and develop some new operational rules for LIFNs to consider the interactions between the membership function and non-membership function, especially when the degree of non-membership is zero. Then we extend the PHM operator to LIFNs based on the new operational rules, and propose the linguistic intuitionistic fuzzy partitioned Heronian mean (LIFPHM) operator, the linguistic intuitionistic fuzzy weighted partitioned Heronian mean (LIFWPHM) operator, the linguistic intuitionistic fuzzy partitioned geometric Heronian mean (LIFPGHM) operator and the linguistic intuitionistic fuzzy weighted partitioned geometric Heronian mean (LIFWPGHM) operator. Further, we develop two methods to solve multi-attribute group decision making (MAGDM) problems with linguistic intuitionistic fuzzy information. Finally, we give some examples to verify the effectiveness of the two proposed methods by comparing them with existing methods.
Liu, Q, Chen, P, Wang, B, Zhang, J & Li, J 2018, 'dbMPIKT: a database of kinetic and thermodynamic mutant protein interactions', BMC Bioinformatics, vol. 19, no. 1.
View/Download from: Publisher's site
Liu, Q, Wu, R, Chen, E, Xu, G, Su, Y, Chen, Z & Hu, G 2018, 'Fuzzy Cognitive Diagnosis for Modelling Examinee Performance', ACM Transactions on Intelligent Systems and Technology, vol. 9, no. 4, pp. 1-26.
View/Download from: Publisher's site
View description>>
Recent decades have witnessed the rapid growth of educational data mining (EDM), which aims at automatically extracting valuable information from large repositories of data generated by or related to people’s learning activities in educational settings. One of the key EDM tasks is cognitive modelling with examination data, and cognitive modelling tries to profile examinees by discovering their latent knowledge state and cognitive level (e.g. the proficiency of specific skills). However, to the best of our knowledge, the problem of extracting information from both objective and subjective examination problems to achieve more precise and interpretable cognitive analysis remains underexplored. To this end, we propose a fuzzy cognitive diagnosis framework (FuzzyCDF) for examinees’ cognitive modelling with both objective and subjective problems. Specifically, to handle the partially correct responses on subjective problems, we first fuzzify the skill proficiency of examinees. Then we combine fuzzy set theory and educational hypotheses to model the examinees’ mastery on the problems based on their skill proficiency. Finally, we simulate the generation of examination score on each problem by considering slip and guess factors. In this way, the whole diagnosis framework is built. For further comprehensive verification, we apply our FuzzyCDF to three classical cognitive assessment tasks, i.e., predicting examinee performance, slip and guess detection, and cognitive diagnosis visualization. Extensive experiments on three real-world datasets for these assessment tasks prove that FuzzyCDF can reveal the knowledge states and cognitive level of the examinees effectively and interpretatively.
Llopis-Albert, C, Merigó, JM, Liao, H, Xu, Y, Grima-Olmedo, J & Grima-Olmedo, C 2018, 'Water Policies and Conflict Resolution of Public Participation Decision-Making Processes Using Prioritized Ordered Weighted Averaging (OWA) Operators', Water Resources Management, vol. 32, no. 2, pp. 497-510.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media B.V. There is a growing interest in environmental policies about how to implement public participation engagement in the context of water resources management. This paper presents a robust methodology, based on ordered weighted averaging (OWA) operators, to conflict resolution decision-making problems under uncertain environments due to both information and stakeholders’ preferences. The methodology allows integrating heterogeneous interests of the general public and stakeholders on account of their different degree of acceptance or preference and level of influence or power regarding the measures and policies to be adopted, and also of their level of involvement (i.e., information supply, consultation and active involvement). These considerations lead to different environmental and socio-economic outcomes, and levels of stakeholders’ satisfaction. The methodology establishes a prioritization relationship over the stakeholders. The individual stakeholders’ preferences are aggregated through their associated weights, which depend on the satisfaction of the higher priority decision maker. The methodology ranks the optimal management strategies to maximize the stakeholders’ satisfaction. It has been successfully applied to a real case study, providing greater fairness, transparency, social equity and consensus among actors. Furthermore, it provides support to environmental policies, such as the EU Water Framework Directive (WFD), improving integrated water management while covering a wide range of objectives, management alternatives and stakeholders.
Llopis‐Albert, C, Merigó, JM, Xu, Y & Liao, H 2018, 'Application of Fuzzy Set/Qualitative Comparative Analysis to Public Participation Projects in Support of the EU Water Framework Directive', Water Environment Research, vol. 90, no. 1, pp. 74-83.
View/Download from: Publisher's site
View description>>
ABSTRACT: This study analyzes the level of satisfaction of stakeholders in the public participation process (PPP) of water resources management, which is mandatory according to the EU Water Framework Directive (WFD). The methodology uses a fuzzy set/qualitative comparative analysis (fsQCA), which allows the identification of a combination of factors that lead to the outcome that is stakeholders' satisfaction. It allows dealing with uncertain environments due to the heterogeneous nature of stakeholders and factors. The considered causes range from environmental objectives pursued, actual capacity of efficiently carrying out those objectives, socioeconomic development of the region, level of involvement and means of participation of the stakeholders engaged in the PPP, and alternative policies and measures that should be performed. Results support the argument that different causal paths explain the stakeholders' satisfaction. The methodology may help in the implementation of the WFD and conflict resolution since it leads to greater fairness, social equity, and consensus among stakeholders.
Ma, Y, Yu, Z, Han, G, Li, J & Anh, V 2018, 'Identification of pre-microRNAs by characterizing their sequence order evolution information and secondary structure graphs', BMC Bioinformatics, vol. 19, no. S19, pp. 521-521.
View/Download from: Publisher's site
View description>>
BACKGROUND:Distinction between pre-microRNAs (precursor microRNAs) and length-similar pseudo pre-microRNAs can reveal more about the regulatory mechanism of RNA biological processes. Machine learning techniques have been widely applied to deal with this challenging problem. However, most of them mainly focus on secondary structure information of pre-microRNAs, while ignoring sequence-order information and sequence evolution information. RESULTS:We use new features for the machine learning algorithms to improve the classification performance by characterizing both sequence order evolution information and secondary structure graphs. We developed three steps to extract these features of pre-microRNAs. We first extract features from PSI-BLAST profiles and Hilbert-Huang transforms, which contain rich sequence evolution information and sequence-order information respectively. We then obtain properties of small molecular networks of pre-microRNAs, which contain refined secondary structure information. These structural features are carefully generated so that they can depict both global and local characteristics of pre-microRNAs. In total, our feature space covers 591 features. The maximum relevance and minimum redundancy (mRMR) feature selection method is adopted before the support vector machine (SVM) is applied as our classifier. The constructed classification model is named MicroRNA-NHPred. The performance of MicroRNA-NHPred is high and stable, and better than that of the state-of-the-art methods, achieving an accuracy of up to 94.83% on the same benchmark datasets. CONCLUSIONS:The high prediction accuracy achieved by our proposed method is attributed to the design of a comprehensive feature set on the sequences and secondary structures, which is capable of characterizing the sequence evolution information and sequence-order information, and global and local information of pre-microRNA secondary structures. MicroRNA-NHPred is a valuable method for pre-microRNAs i...
Maldonado, S, Merigó, J & Miranda, J 2018, 'Redefining support vector machines with the ordered weighted average', Knowledge-Based Systems, vol. 148, pp. 41-46.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier B.V. In this work, the classical soft-margin Support Vector Machine (SVM) formulation is redefined with the inclusion of an Ordered Weighted Averaging (OWA) operator. In particular, the hinge loss function is rewritten as a weighted sum of the slack variables to guarantee adequate model fit. The proposed two-step approach trains a soft-margin SVM first to obtain the slack variables, which are then used to induce the order for the OWA operator in a second SVM training. Although originally developed as a linear method, our proposal is extended to nonlinear classification thanks to the use of kernel functions. Experimental results show that the proposed method achieved the best overall predictive performance compared with the standard SVM and other well-known data mining methods.
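A rough sketch of the two-step idea under stated assumptions: fit a standard soft-margin SVM, compute the slack of each training point, use the slack ranking to assign OWA weights, and re-train with per-sample weights. scikit-learn's sample_weight rescales each sample's hinge loss, which only approximates the weighted-slack objective described in the paper, and the linearly decaying OWA vector is an arbitrary illustrative choice:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
y_pm = 2 * y - 1                                   # labels in {-1, +1}

# Step 1: standard soft-margin SVM to obtain the slack variables
svm1 = SVC(kernel="linear", C=1.0).fit(X, y)
slack = np.maximum(0.0, 1.0 - y_pm * svm1.decision_function(X))

# OWA weighting vector over the ordered slacks (here: linearly decaying weights)
n = len(X)
owa = np.linspace(2.0, 0.1, n)
owa = owa / owa.sum() * n                          # keep the average weight at 1

# Step 2: re-train, weighting each sample according to its slack rank
order = np.argsort(-slack)                         # largest slack first
sample_weight = np.empty(n)
sample_weight[order] = owa
svm2 = SVC(kernel="linear", C=1.0).fit(X, y, sample_weight=sample_weight)
print("train accuracy:", svm2.score(X, y))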
Martínez-López, FJ, Merigó, JM, Valenzuela-Fernández, L & Nicolás, C 2018, 'Fifty years of the European Journal of Marketing: a bibliometric analysis', European Journal of Marketing, vol. 52, no. 1/2, pp. 439-468.
View/Download from: Publisher's site
View description>>
Purpose: The European Journal of Marketing was created in 1967. In 2017, the journal celebrates its 50th anniversary. Therefore, the purpose of this study is to present a bibliometric overview of the leading trends of the journal during this period. Design/methodology/approach: This work uses the Scopus database to analyse the most productive authors, institutions and countries, as well as the most cited papers and the citing articles. The investigation uses bibliometric indicators to represent the bibliographic data, including the total number of publications and citations between 1967 and 2017. Additionally, the article also develops a graphical visualization of the bibliographic material by using the visualization of similarities viewer software to map journals, keywords and institutions with bibliographic coupling and co-citation analysis. Findings: British authors and institutions are the most productive in the journal, although Australian authors are significantly increasing their number of published papers. Continental European institutions are also increasing their number of publications, but they are still far from matching the British contribution. In the mid-term, however, authors and institutions from this zone, especially those from large European countries such as France, Germany, Italy and Spain, should approach the performance of British ones, particularly as more recent periods of analysis are considered. Practical implications: This article is useful for any reader of this journal to understand questions such as papers' Eu...
Mauleon-mendez, E, Genovart-balaguer, J, Merigo, J & Mulet-forteza, C 2018, 'Sustainable Tourism Research Towards Twenty-Five Years of the Journal of Sustainable Tourism', Advances in Hospitality and Tourism Research (AHTR), vol. 6, no. 1, pp. 23-46.
View/Download from: Publisher's site
View description>>
The Journal of Sustainable Tourism (JOST) is a main journal in 'Geography, Planning and Development'. This paper presents a general overview of the journal over its lifetime by using bibliometric indicators. The paper uses the Scopus database to analyse the bibliometric data. This analysis includes key issues such as the publication and citation structure of the journal; the most cited articles; the leading authors, institutions, and countries in the journal; and the keywords that are most often used. This paper also uses the visualization of similarities to graphically map the bibliographic material. This analysis provides further insights into how JOST links to other journals and how it links researchers across the globe. These results indicate that JOST is one of the leading journals in the areas where the journal is indexed, with a wide range of authors from institutions and countries from all over the world publishing in it.
Meng, F, Rui, X, Wang, Z, Xing, Y & Cao, L 2018, 'Coupled Node Similarity Learning for Community Detection in Attributed Networks', Entropy, vol. 20, no. 6, pp. 471-471.
View/Download from: Publisher's site
View description>>
© 2018 by the authors. Attributed networks consist of not only a network structure but also node attributes. Most existing community detection algorithms only focus on network structures and ignore node attributes, which are also important. Although some algorithms using both node attributes and network structure information have been proposed in recent years, the complex hierarchical coupling relationships within and between attributes, nodes and network structure have not been considered. Such hierarchical couplings are driving factors in community formation. This paper introduces a novel coupled node similarity (CNS) to involve and learn attribute and structure couplings and compute the similarity within and between nodes with categorical attributes in a network. CNS learns and integrates the frequency-based intra-attribute coupled similarity within an attribute, the co-occurrence-based inter-attribute coupled similarity between attributes, and coupled attribute-to-structure similarity based on the homophily property. CNS is then used to generate the weights of edges and transfer a plain graph to a weighted graph. Clustering algorithms detect community structures that are topologically well-connected and semantically coherent on the weighted graphs. Extensive experiments verify the effectiveness of CNS-based community detection algorithms on several data sets by comparing with the state-of-the-art node similarity measures, whether they involve node attribute information and hierarchical interactions, and on various levels of network structure complexity.
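As a toy stand-in for the idea of turning node-attribute similarity into edge weights before clustering (plain attribute matching rather than the full coupled node similarity, and the karate club graph's single 'club' attribute rather than a real attributed network):

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def attribute_matching_weight(attrs_u, attrs_v):
    # fraction of categorical attributes shared by the two endpoints
    # (a simple stand-in for the coupled node similarity)
    shared = sum(attrs_u[k] == attrs_v[k] for k in attrs_u)
    return 0.1 + shared / len(attrs_u)          # keep weights strictly positive

G = nx.karate_club_graph()                      # toy attributed network: 'club' is the only node attribute
for u, v in G.edges():
    G[u][v]["weight"] = attribute_matching_weight({"club": G.nodes[u]["club"]},
                                                  {"club": G.nodes[v]["club"]})

communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])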
Merigó, JM, Gil-Lafuente, AM, Yu, D & Llopis-Albert, C 2018, 'Fuzzy decision making in complex frameworks with generalized aggregation operators', Applied Soft Computing, vol. 68, pp. 314-321.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier B.V. This article presents a new aggregation system applied to fuzzy decision making. The fuzzy generalized unified aggregation operator (FGUAO) is a system that integrates many operators by adding a new aggregation process that considers the relevance that each operator has in the analysis. It also deals with an uncertain environment where the information is studied with fuzzy numbers. A wide range of particular cases and properties are studied. This approach is further extended by using quasi-arithmetic means. The paper ends by studying the applicability in decision making problems regarding European Union decisions. To do so, the work uses a multi-person aggregation process, obtaining the multi-person FGUAO operator. An example concerning the fixation of the interest rate by the European Central Bank is presented.
Merigó, JM, Pedrycz, W, Weber, R & de la Sotta, C 2018, 'Fifty years of Information Sciences: A bibliometric overview', Information Sciences, vol. 432, pp. 245-268.
View/Download from: Publisher's site
View description>>
© 2017 Elsevier Inc. Information Sciences is a leading international journal in computer science launched in 1968, so becoming fifty years old in 2018. In order to celebrate its anniversary, this study presents a bibliometric overview of the leading publication and citation trends occurring in the journal. The aim of the work is to identify the most relevant authors, institutions, countries, and analyze their evolution through time. The paper uses the Web of Science Core Collection in order to search for the bibliographic information. Our study also develops a graphical mapping of the bibliometric material by using the visualization of similarities (VOS) viewer. With this software, the work analyzes bibliographic coupling, citation and co-citation analysis, co-authorship, and co-occurrence of keywords. The results underline the significant growth of the journal through time and its international diversity having publications from countries all over the world.
Merigó, JM, Zhou, L, Yu, D, Alrajeh, N & Alnowibet, K 2018, 'Probabilistic OWA distances applied to asset management', Soft Computing, vol. 22, no. 15, pp. 4855-4878.
View/Download from: Publisher's site
View description>>
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature. Average distances are widely used in many fields for calculating the distances between two sets of elements. This paper presents several new average distances by using the ordered weighted average, the probability and the weighted average. First, the work presents the probabilistic ordered weighted averaging weighted average distance (POWAWAD) operator. POWAWAD is a new aggregation operator that uses distance measures in a unified framework between the probability, the weighted average and the ordered weighted average (OWA) operator that considers the degree of importance that each concept has in the aggregation. The POWAWAD operator includes a wide range of particular cases including the maximum distance, the minimum distance, the normalized Hamming distance, the weighted Hamming distance and the ordered weighted average distance (OWAD). The article also presents further generalizations by using generalized and quasi-arithmetic means, forming the generalized probabilistic ordered weighted averaging weighted average distance (GPOWAWAD) operator and the quasi-POWAWAD operator. The study ends by analysing the applicability of this new approach in the calculation of average fixed assets. In particular, the work focuses on measuring the average distances between the ideal percentage of fixed assets that the companies of a specific country should have versus the real percentage of fixed assets they have. The illustrative example focuses on the Asian market.
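The OWAD operator listed above as a particular case is easy to state concretely; a minimal sketch, with made-up fixed-asset percentages purely for illustration:

import numpy as np

def owad(x, y, weights):
    # ordered weighted averaging distance: individual distances |x_i - y_i|
    # are reordered (largest first) before being weighted and summed
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0), "OWA weights must sum to 1"
    d = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return np.sort(d)[::-1] @ weights

ideal_fixed_assets = [0.40, 0.35, 0.50, 0.45]   # hypothetical ideal percentages
real_fixed_assets  = [0.30, 0.42, 0.55, 0.20]   # hypothetical observed percentages
print(owad(ideal_fixed_assets, real_fixed_assets, [0.4, 0.3, 0.2, 0.1]))      # emphasises the largest deviations
print(owad(ideal_fixed_assets, real_fixed_assets, [0.25, 0.25, 0.25, 0.25]))  # normalized Hamming distance case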
Mirtalaie, MA, Hussain, OK, Chang, E & Hussain, FK 2018, 'Extracting sentiment knowledge from pros/cons product reviews: Discovering features along with the polarity strength of their associated opinions', Expert Systems with Applications, vol. 114, pp. 267-288.
View/Download from: Publisher's site
View description>>
Sentiment knowledge extraction is a growing area of research in the literature. It helps in analyzing users’ opinions about different entities or events, which can then be utilized by analysts for various purposes. Particularly, feature-based sentiment analysis is one of the challenging research areas that analyzes users’ opinions on various features of a product or service. Of the three formats for the product reviews, our focus in this paper is limited to analyzing the pros/cons type. Due to the nature of pros/cons reviews, they are mostly concise and follow a different structure from other review types. Therefore, specialized techniques are needed to analyze these reviews and extract the customers’ discussed product features along with their personal attitudes. In this paper, we propose the Pros/Cons Sentiment Analyzer (PCSA) framework that exploits dependency relations in extracting sentiment knowledge from pros/cons reviews. We also utilize two different lexicons to ascertain the polarity strength of the extracted features based on the customers’ opinions. Several experiments are conducted to evaluate the performance of PCSA in its different phases.
Mulet-Forteza, C, Martorell-Cunill, O, Merigó, JM, Genovart-Balaguer, J & Mauleon-Mendez, E 2018, 'Twenty five years of the Journal of Travel & Tourism Marketing: a bibliometric ranking', Journal of Travel & Tourism Marketing, vol. 35, no. 9, pp. 1201-1221.
View/Download from: Publisher's site
View description>>
© 2018, © 2018 Informa UK Limited, trading as Taylor & Francis Group. The Journal of Travel & Tourism Marketing (JTTM) is a leading international journal in “Marketing” and “Tourism, Leisure and Hospitality Management.” JTTM published its first issue in 1992. In 2017, the journal has celebrated its twenty-fifth anniversary. For that reason, this study analyzes all the publications in the journal since its creation by using a bibliometric approach. The objective is to provide a complete overview of the main factors that affect the journal. This analysis includes key issues such as the distribution of annual publications and citations, the most cited papers, the h-index, citations per paper, the keywords that are mostly used, the influence on the publishing industry and authors, universities, and the countries that have the most publications. The paper uses the Scopus database to analyze the bibliometric data. Additionally, the paper also uses the visualization of similarities (VOS) viewer software to map graphically the bibliographic material. The graphical analysis uses bibliographic coupling, co-citation, citation, and co-occurrence of keywords. These results indicate that JTTM is one of the leading journals in the areas where the journal is indexed, with publications from a wide range of authors, institutions, and countries around the world.
Mustapha, S, Braytee, A & Ye, L 2018, 'Multisource Data Fusion for Classification of Surface Cracks in Steel Pipes', Journal of Nondestructive Evaluation, Diagnostics and Prognostics of Engineering Systems, vol. 1, no. 2.
View/Download from: Publisher's site
View description>>
This paper focuses on the development and validation of a robust framework for surface crack detection and assessment in steel pipes based on measured vibration responses collected using a network of piezoelectric (PZT) wafers. The pipe structure considered in this study contained multiple progressive cracks occurring at different locations and with various orientations (along the circumference or length). The fusion of data collected from multiple PZT wafers was investigated based on two approaches: (a) combining the raw data from all sensors before establishing a statistical model for damage classification and (b) combining the features from each sensor after applying a multiclass support vector machine recursive feature elimination (MCSVM-RFE), for dimensionality reduction, and taking the union of discriminative features among the different sources of data. A MCSVM learning algorithm was employed to train the data and generate a statistical classifier. The dataset consisted of ten classes, comprising nine damage cases and the healthy state. The prediction based on the two fusion approaches achieved a high accuracy, exceeding 95%, but the number of features needed to reach that accuracy differed between the two approaches. Furthermore, the performance and the precision of the classifier were evaluated when the data from only a single sensor was used compared with the combined data from all the sensors within the network. Very promising results in the classification of damage were obtained, based on the case study that included multiple damage scenarios with different lengths and orientations.
Nawaz, F, Janjua, NK, Hussain, OK, Hussain, FK, Chang, E & Saberi, M 2018, 'Event-driven approach for predictive and proactive management of SLA violations in the Cloud of Things', Future Generation Computer Systems, vol. 84, pp. 78-97.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier B.V. In a dynamic environment such as the cloud-of-things, one of the most critical factors for successful service delivery is the QoS under defined constraints. Even though guarantees in the form of service level agreements (SLAs) are provided to users, many services exhibit dynamic Quality of Service (QoS) variations. This QoS variation as well as changes in the behavior and state of the service is caused by some internal events (such as varying loads) and external events (such as location and weather), which results in frequent SLA violations. Most of the existing violation prediction approaches use historic data to predict future QoS values. They do not consider dynamic changes and the events that cause these changes in QoS attributes. In this paper, we propose an event-driven-based proactive approach for predicting SLA violations by combining logic-based reasoning and probabilistic inferencing. The results show that our proposed approach is efficient and proactively identifies SLA violations under uncertain QoS observations.
Olvera, C, Berbegal-Mirabent, J & Merigó, JM 2018, 'A Bibliometric Overview of University-Business Collaboration between 1980 and 2016', Computación y Sistemas, vol. 22, no. 4, pp. 1171-1190.
View/Download from: Publisher's site
View description>>
© 2018 Instituto Politecnico Nacional. All rights reserved. Bibliometrics is a research field that analyses bibliographic material from a quantitative point of view. Aiming at providing a comprehensive overview, this study scrutinises the academic literature in university business collaboration and technology transfer research for the period post the Bayh-Dole Act (1980-2016). The study employs the Web of Science as the main database from where information is collected. Bibliometric indicators such as number of publications, citations, productivity, and the H-index are used to analyse the results. The main findings are displayed in the form of tables and are further discussed. The focus is on the identification of the most relevant journals in this area, the most cited papers, most prolific authors, leading institutions, and countries. The results show that the USA, England, Spain, Italy, and the Netherlands are highly active in this area. Scientific production tends to fall within the research areas of business and economics, engineering or public administration, and is mainly published in journals such as Research Policy, Technovation and Journal of Technology Transfer.
Ouyang, Y, Guo, B, Guo, T, Cao, L & Yu, Z 2018, 'Modeling and Forecasting the Popularity Evolution of Mobile Apps', Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 2, no. 4, pp. 1-23.
View/Download from: Publisher's site
View description>>
In recent years, with the rapid development of the mobile app ecosystem, the number and categories of mobile apps have grown tremendously. However, the global prevalence of mobile apps also leads to fierce competition. As a result, many apps will disappear. To thrive in this competitive app market, it is vital for app developers to understand the popularity evolution of their mobile apps and inform strategic decision-making for better mobile app development. Therefore, it is significant and necessary to model and forecast the future popularity evolution of mobile apps. The popularity evolution of mobile apps is usually a long-term process, affected by various complex factors. However, existing works lack the capabilities to model such complex factors. To better understand the popularity evolution, in this paper, we aim to forecast the popularity evolution of mobile apps by incorporating complex factors, i.e., exogenous stimuli and endogenous excitations. Specifically, we propose a model based on the Multivariate Hawkes Process (MHP), which is an exogenous stimulus-driven self-exciting point process, to model the exogenous stimuli and endogenous excitations simultaneously. Extensive experimental studies on a real-world dataset from an app store demonstrate that MHP outperforms the state-of-the-art methods regarding popularity evolution forecasting.
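For intuition about the modelling ingredient, a univariate exponential-kernel Hawkes intensity (a simplification of the multivariate process used in the paper; the event times and parameters are hypothetical):

import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    # Conditional intensity of a univariate Hawkes process with an exponential
    # kernel: lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    # mu is the exogenous (baseline) rate; the sum is the endogenous
    # self-excitation from past events (e.g. earlier download bursts of an app).
    past = events[events < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

events = np.array([0.5, 1.2, 1.3, 4.0, 4.1, 4.15])     # hypothetical event times
for t in [1.0, 2.0, 5.0]:
    print(t, hawkes_intensity(t, events, mu=0.2, alpha=0.8, beta=1.5))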
Peng, H, Zheng, Y, Blumenstein, M, Tao, D & Li, J 2018, 'CRISPR/Cas9 cleavage efficiency regression through boosting algorithms and Markov sequence profiling', Bioinformatics, vol. 34, no. 18, pp. 3069-3077.
View/Download from: Publisher's site
View description>>
Motivation: CRISPR/Cas9 system is a widely used genome editing tool. A prediction problem of great interest for this system is: how to select optimal single-guide RNAs (sgRNAs), such that the cleavage efficiency is high while the off-target effect is low. Results: This work proposed a two-step averaging method (TSAM) for the regression of cleavage efficiencies of a set of sgRNAs by averaging the predicted efficiency scores of a boosting algorithm and those of a support vector machine (SVM). We also proposed to use profiled Markov properties as novel features to capture the global characteristics of sgRNAs. These new features are combined with the outstanding features ranked by the boosting algorithm for the training of the SVM regressor. TSAM improved the mean Spearman correlation coefficients compared with the state-of-the-art performance on benchmark datasets containing thousands of human, mouse and zebrafish sgRNAs. Our method can also be converted to make binary distinctions between efficient and inefficient sgRNAs with superior performance to the existing methods. The analysis reveals that highly efficient sgRNAs have lower melting temperature at the middle of the spacer, cut at parts of the genome closer to the 5’-end and contain more ‘A’ but less ‘G’ compared with inefficient ones. Comprehensive further analysis also demonstrates that our tool can predict an sgRNA’s cutting efficiency with consistently good performance whether it is expressed from a U6 promoter in cells or from a T7 promoter in vitro. Availability and implementation: Online tool is available at http://www.aai-bioinfo.com/CRISPR/. Python and Matlab source codes are freely available at https://github.com/penn-hui/TSAM.
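A hedged sketch of the two-step averaging idea on synthetic data: average the predictions of a boosting regressor and an SVM regressor and score them with Spearman correlation. The features here are random stand-ins, not the Markov-profile or boosting-ranked sgRNA features used by TSAM:

import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 40))                                 # stand-in for encoded sgRNA features
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=500)     # synthetic efficiency scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
gbr = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
svr = SVR(kernel="rbf").fit(X_tr, y_tr)

avg_pred = 0.5 * (gbr.predict(X_te) + svr.predict(X_te))       # average the two regressors
for name, pred in [("boosting", gbr.predict(X_te)), ("SVR", svr.predict(X_te)), ("averaged", avg_pred)]:
    print(name, round(spearmanr(y_te, pred).correlation, 3))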
Peng, H, Zheng, Y, Zhao, Z, Liu, T & Li, J 2018, 'Recognition of CRISPR/Cas9 off-target sites through ensemble learning of uneven mismatch distributions', Bioinformatics, vol. 34, no. 17, pp. i757-i765.
View/Download from: Publisher's site
View description>>
Motivation: CRISPR/Cas9 is driving a broad range of innovative applications from basic biology to biotechnology and medicine. One of its current issues is the effect of off-target editing that should be critically resolved and should be completely avoided in the ideal use of this system. Results: We developed an ensemble learning method to detect the off-target sites of a single guide RNA (sgRNA) from its thousands of genome-wide candidates. Nucleotide mismatches between on-target and off-target sites have been studied recently. We confirm that there exists strong mismatch enrichment and preferences at the 5′-end close regions of the off-target sequences. Compared with the on-target sites, sequences of no-editing sites can also be characterized by GC composition changes and position-specific mismatch binary features. Under this novel space of features, an ensemble strategy was applied to train a prediction model. The model achieved a mean Area Under the Receiver Operating Characteristic curve score of 0.99 and a mean Area Under the Precision-Recall curve score of 0.45 in cross-validations on big datasets, outperforming state-of-the-art methods in various test scenarios. Our predicted off-target sites also correspond very well to those detected by high-throughput sequencing techniques. Especially, two case studies for selecting sgRNAs to cure hearing loss and retinal degeneration partly prove the effectiveness of our method. Availability and implementation: The Python and Matlab versions of the source codes for detecting off-target sites of a given sgRNA and the supplementary files are...
Qin, M, Jin, D, Lei, K, Gabrys, B & Musial-Gabrys, K 2018, 'Adaptive community detection incorporating topology and content in social networks', Knowledge-Based Systems, vol. 161, pp. 342-356.
View/Download from: Publisher's site
View description>>
© 2018 In social network analysis, community detection is a basic step in understanding the structure and function of networks. Some conventional community detection methods may have limited performance because they merely focus on the networks’ topological structure. Besides topology, content information is another significant aspect of social networks. Although some state-of-the-art methods have started to combine these two aspects of information to improve community partitioning, they often assume that topology and content carry similar information. In fact, for some examples of social networks, the hidden characteristics of content may unexpectedly mismatch with topology. To better cope with such situations, we introduce a novel community detection method under the framework of non-negative matrix factorization (NMF). Our proposed method integrates topology as well as content of networks and has an adaptive parameter (with two variations) to effectively control the contribution of content with respect to the identified mismatch degree. Based on the disjoint community partition result, we also introduce an additional overlapping community discovery algorithm, so that our new method can meet the application requirements of both disjoint and overlapping community detection. The case study using real social networks shows that our new method can simultaneously obtain the community structures and their corresponding semantic descriptions, which is helpful for understanding the semantics of communities. Related performance evaluations on both artificial and real networks further indicate that our method outperforms several state-of-the-art methods while exhibiting more robust behavior when a mismatch between topology and content is observed.
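As a rough illustration of fusing topology and content under an NMF framework (not the paper's adaptive formulation), the sketch below factorizes a weighted concatenation of an adjacency matrix and a node-content matrix; the weight alpha stands in for an adaptive content-contribution parameter, and all matrices are synthetic placeholders.

```python
# Illustrative sketch: NMF on a weighted concatenation of adjacency (topology)
# and node-content matrices; alpha controls the contribution of content.
# This is a simplified stand-in, not the paper's adaptive NMF model.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
n_nodes, n_terms, k = 60, 100, 3
A = (rng.random((n_nodes, n_nodes)) < 0.05).astype(float)   # placeholder adjacency matrix
A = np.maximum(A, A.T)                                       # make the graph undirected
C = rng.random((n_nodes, n_terms))                           # placeholder content (e.g. TF-IDF)

alpha = 0.5                                                  # content weight (placeholder)
M = np.hstack([A, alpha * C])                                # fuse topology and content

W = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0).fit_transform(M)
communities = W.argmax(axis=1)                               # disjoint community assignment
print(np.bincount(communities, minlength=k))
```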
Saberi, M, Theobald, M, Hussain, OK, Chang, E & Hussain, FK 2018, 'Interactive feature selection for efficient customer recognition in contact centers: Dealing with common names', Expert Systems with Applications, vol. 113, pp. 356-376.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier Ltd We propose an interactive decision-making framework to assist a Customer Service Representative (CSR) in the efficient and effective recognition of customer records in a database with many ambiguous entries. Our proposed framework consists of three integrated modules. The first module focuses on the detection and resolution of duplicate records to improve effectiveness and efficiency in customer recognition. The second module determines the level of ambiguity in recognizing an individual customer when there are multiple records with the same name. The third module recommends the series of feature-related questions that the CSR should ask the customer to enable rapid recognition, based on that level of ambiguity. In the first module, the F-Swoosh approach for duplicate detection is used, and in the second module a dynamic programming-based technique is used to determine the level of ambiguity within the customer database for a given name. In the third module, Levenshtein edit distance is used for feature selection in combination with weights based on the Inverse Document Frequency (IDF) of terms. The algorithm that requires the minimum number of questions to be put to the customer to achieve recognition is the algorithm that is chosen. We evaluate the proposed framework on a synthetic dataset and demonstrate how it assists the CSR to rapidly recognize the correct customer.
Schönbach, C, Li, J, Ma, L, Horton, P, Sjaugi, MF & Ranganathan, S 2018, 'A bioinformatics potpourri', BMC Genomics, vol. 19, no. S1.
View/Download from: Publisher's site
Shahsavari, M, Golpayegani, AH, Saberi, M & Hussain, FK 2018, 'Recruiting the K-most influential prospective workers for crowdsourcing platforms', Service Oriented Computing and Applications, vol. 12, no. 3-4, pp. 247-257.
View/Download from: Publisher's site
View description>>
© 2018, Springer-Verlag London Ltd., part of Springer Nature. Viral marketing is widely used by businesses to achieve their marketing objectives using social media. In this work, we propose a customized crowdsourcing approach for viral marketing which aims at efficient marketing based on information propagation through a social network. We term this approach the social community-based crowdsourcing platform and integrate it with an information diffusion model to find the most efficient crowd workers. We propose an intelligent viral marketing framework (IVMF) comprising two modules to achieve this end. The first module identifies the K-most influential users in a given social network for the platform using a novel linear threshold diffusion model. The proposed model considers the different propagation behaviors of the network users in relation to different contexts. Being able to consider multiple topics in the information propagation model, as opposed to only one topic, makes our model more applicable to a diverse population base. Additionally, the proposed content-based improved greedy (CBIG) algorithm enhances the basic greedy algorithm by decreasing the total amount of computation it requires, namely the computation of the total influence propagation of a node at each step of the greedy algorithm. The highest computational cost of the basic greedy algorithm is incurred in computing the total influence propagation of each node. The results of the experiments reveal that the number of iterations in our CBIG algorithm is much smaller than in the basic greedy algorithm, while the precision in choosing the K influential nodes in a social network is close to that of the greedy algorithm. The second module of the IVMF framework, the multi-objective integer optimization model, is used to determine which social network should be targeted for viral marketing, taking into account the marketing budget. The overall IVMF framework can be used to select a social network and rec...
Takalkar, M, Xu, M, Wu, Q & Chaczko, Z 2018, 'A survey: facial micro-expression recognition', Multimedia Tools and Applications, vol. 77, no. 15, pp. 19301-19325.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media, LLC. Facial expression recognition plays a crucial role in a wide range of applications in psychotherapy, security systems, marketing, commerce and much more. Detecting a macro-expression, which is a direct representation of an ‘emotion,’ is a relatively straightforward task. Playing as pivotal a role as macro-expressions, micro-expressions are more accurate indicators of a train of thought or even subtle, passive or involuntary thoughts. Compared to macro-expressions, identifying micro-expressions is a much more challenging research question because their time spans are narrowed down to a fraction of a second and they can only be defined using a broader classification scale. This paper is an all-inclusive survey-cum-analysis of the various micro-expression recognition techniques. We analyze the general framework for a micro-expression recognition system by decomposing the pipeline into fundamental components, namely face detection, pre-processing, facial feature detection and extraction, datasets, and classification. We discuss the role of these elements and highlight the models and new trends that are followed in their design. Moreover, we provide an extensive analysis of micro-expression recognition systems by comparing their performance. We also discuss the new deep learning features that can, in the near future, replace the hand-crafted features for facial micro-expression recognition. This survey focuses on the methodologies applied, the databases used and the performance in terms of recognition accuracy, and compares these to distil the gaps in efficiency, future scope and research potential. Through this survey, we intend to look into this problem and develop a comprehensive and efficient recognition scheme. This study allows us to identify open issues and to determine future directions for designing real-world micro-expression recognition systems.
Trianni, A, Merigó, JM & Bertoldi, P 2018, 'Ten years of Energy Efficiency: a bibliometric analysis', Energy Efficiency, vol. 11, no. 8, pp. 1917-1939.
View/Download from: Publisher's site
View description>>
© 2018, Springer Nature B.V. Energy Efficiency is an international journal dedicated to research topics connected to energy with a focus on end-use efficiency issues. In 2018, the journal celebrates its 10th anniversary. In order to mark this occasion and analyze not only how the journal has been performing over the years, but also the trends in academic debate and research in this journal, this article presents a bibliometric overview of the publication and citation structure of the journal during the period 2008–2017. The study relies on the Web of Science Core Collection and the Scopus database to collect the bibliographic results. Additionally, the work exploits the visualization of similarities (VOS) viewer software to graphically map the bibliographic material. The research analyses the most cited papers and the most popular keywords. Moreover, the paper studies how the journal connects with other international journals and identifies the most productive authors, institutions, and countries. The results indicate that the journal has grown rapidly over the years and obtained a merited position in the scientific community, with contributions from authors all over the world (with Europe as the most productive region). Moreover, the journal has focused so far mainly on energy efficiency issues in close relationship with policies and incentives, corporate energy efficiency, consumer behavior, and demand-side management programs, with the industrial, building and transport sectors all widely involved. Our discussion concludes with suggested future research avenues, in particular towards coordinated efforts from different disciplines (technical, economic, and sociopsychological ones) to address the emerging energy efficiency challenges.
Tur-Porcar, A, Mas-Tur, A, Merigó, JM, Roig-Tierno, N & Watt, J 2018, 'A Bibliometric History of the Journal of Psychology Between 1936 and 2015', The Journal of Psychology, vol. 152, no. 4, pp. 199-225.
View/Download from: Publisher's site
Valenzuela-Fernández, L, Merigó, JM & Nicolas, C 2018, 'The most influential countries in market orientation', International Journal of Engineering Business Management, vol. 10, pp. 184797901775148-184797901775148.
View/Download from: Publisher's site
View description>>
© The Author(s) 2018. The purpose of this article is to analyze the most productive and influential countries engaging in market orientation (MO) research between 1990 and 2016. This article shows the general trajectories of these countries, the relationships among them, and their research in the area of MO by analyzing results on citations and publications. The article uses applied bibliometric techniques on available information found in the Web of Science. The results show that the 10 leading countries produce more than 70% of total publications, where the United States leads in all indicators, followed by the United Kingdom and China. Furthermore, although there has been a steady increase in overall number of publications, this trend is not shared evenly among different nations.
Valenzuela-Fernández, L, Nicolas, C & Merigo, JM 2018, 'Overview of the leading countries in marketing research between 1990 and 2014', American Journal of Business, vol. 33, no. 4, pp. 134-156.
View/Download from: Publisher's site
View description>>
Purpose The purpose of this paper is to present a general overview of the most influential countries according to their scientific contributions in marketing for the 1990–2014 period. In this bibliometric-based research, the authors generate a ranking of the 50 most influential nations according to the H-index and citations per paper, co-authorship, citation analysis and bibliographic coupling. The study provides a map that identifies the networks of researchers between countries. Design/methodology/approach The method used is bibliometric analysis. The relevant research in marketing was extracted from the Web of Science Core Collection database for the 1990–2014 period; 29,947 published articles in 50 countries were obtained. The investigation used the H-index as the first criterion in creating the country ranking, the number of articles (TP) as a proxy for the productivity of each country, the average citations per article (C/P) and the number of citations (TC) to express the influence of a country’s articles. In addition, the study adopts the VOSviewer software to identify the collaboration networks of researchers between countries and the links between countries. Findings The results reveal that, at a general level, 54 percent of countries have an H-index greater than 20. In turn, the authors see a steady increase in the number of publications over the five-year periods. The first ten countries account for over 80 percent of all publications in the sample. The USA is presented as the leader in all indicators, and the results highlight the increasingly important role of China. Research limitations/implications
Wan, Y, Chen, L, Xu, G, Zhao, Z, Tang, J & Wu, J 2018, 'SCSMiner: mining social coding sites for software developer recommendation with relevance propagation', World Wide Web, vol. 21, no. 6, pp. 1523-1543.
View/Download from: Publisher's site
View description>>
© 2018, Springer Science+Business Media, LLC, part of Springer Nature. With the advent of social coding sites, software development has entered a new era of collaborative work. Social coding sites (e.g., GitHub) can integrate social networking and distributed version control in a unified platform to facilitate collaborative development across the world. One unique characteristic of such sites is that the past development experiences of developers provided on the sites convey implicit metrics of a developer’s programming capability and expertise, which can be applied in many areas, such as software developer recruitment for IT corporations. Motivated by this intuition, we aim to develop a framework to effectively locate the developers with the right coding skills. To achieve this goal, we devise a generative probabilistic expert ranking model, upon which consistency among projects is incorporated as graph regularization to enhance the expert ranking, and a relevance propagation perspective is introduced to illustrate the model. For evaluation, StackOverflow is leveraged to complement the ground truth of expertise. Finally, a prototype system, SCSMiner, which provides an expert search service based on a real-world dataset crawled from GitHub, is implemented and demonstrated.
Wan, Y, Xu, G, Chen, L, Zhao, Z & Wu, J 2018, 'Exploiting cross-source knowledge for warming up community question answering services', Neurocomputing, vol. 320, pp. 25-34.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier B.V. Community Question Answering (CQA) services such as Yahoo! Answers, Quora and StackOverflow are collaborative platforms where users can share and exchange their knowledge explicitly by asking and answering questions. One essential task in CQA is learning the topical expertise of users, which may benefit many applications such as question routing and best answer identification. One limitation of existing related works is that they only consider the warm-start users who have posted many questions or answers, while ignoring cold-start users who have few posts. In this paper, we aim to exploit knowledge from cross sources such as GitHub and StackOverflow to build up richer views of expertise for better CQA. Inspired by the idea of Bayesian co-training, we propose a topical expertise model from the perspective of multi-view learning. Specifically, we incorporate the consistency existing among multiple views into a unified probabilistic graphical model. Comprehensive experiments on two real-world datasets demonstrate the performance of our proposed model in comparison with several state-of-the-art ones.
Wang, D, Deng, S & Xu, G 2018, 'Sequence-based context-aware music recommendation', Information Retrieval Journal, vol. 21, no. 2-3, pp. 230-252.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media, LLC. Contextual factors greatly affect users’ preferences for music, so they can benefit music recommendation and music retrieval. However, how to acquire and utilize the contextual information is still facing challenges. This paper proposes a novel approach for context-aware music recommendation, which infers users’ preferences for music, and then recommends music pieces that fit their real-time requirements. Specifically, the proposed approach first learns the low dimensional representations of music pieces from users’ music listening sequences using neural network models. Based on the learned representations, it then infers and models users’ general and contextual preferences for music from users’ historical listening records. Finally, music pieces in accordance with user’s preferences are recommended to the target user. Extensive experiments are conducted on real world datasets to compare the proposed method with other state-of-the-art recommendation methods. The results demonstrate that the proposed method significantly outperforms those baselines, especially on sparse data.
Wang, D, Deng, S, Zhang, X & Xu, G 2018, 'Learning to embed music and metadata for context-aware music recommendation', World Wide Web, vol. 21, no. 5, pp. 1399-1423.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media, LLC, part of Springer Nature. Contextual factors greatly influence users’ musical preferences, so they are remarkably beneficial to music recommendation and retrieval tasks. However, how to obtain and utilize contextual information still needs to be studied. In this paper, we propose a context-aware music recommendation approach, which can recommend music pieces appropriate for users’ contextual preferences for music. In analogy to matrix factorization methods for collaborative filtering, the proposed approach does not require music pieces to be represented by features in advance, but can learn the representations from users’ historical listening records. Specifically, the proposed approach first learns music pieces’ embeddings (feature vectors in a low-dimensional continuous space) from music listening records and corresponding metadata. Then it infers and models users’ global and contextual preferences for music from their listening records with the learned embeddings. Finally, it recommends appropriate music pieces according to the target user’s preferences to satisfy her/his real-time requirements. Experimental evaluations on a real-world dataset show that the proposed approach outperforms baseline methods in terms of precision, recall, F1 score, and hitrate. In particular, our approach performs better on sparse datasets.
Wang, H, Wu, J, Zhu, X, Chen, Y & Zhang, C 2018, 'Time-Variant Graph Classification', IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 50, no. 8, pp. 1-14.
View/Download from: Publisher's site
View description>>
IEEE Graphs are commonly used to represent objects, such as images and text, for pattern classification. In a dynamic world, an object may continuously evolve over time, and so does the graph extracted from the underlying object. These changes in graph structure with respect to the temporal order present a new representation of the graph, in which an object corresponds to a set of time-variant graphs. In this paper, we formulate a novel time-variant graph classification task and propose a new graph feature, called a graph-shapelet pattern, for learning and classifying time-variant graphs. Graph-shapelet patterns are compact and discriminative graph transformation subsequences. A graph-shapelet pattern can be regarded as a graphical extension of a shapelet--a class of discriminative features designed for vector-based temporal data classification. To discover graph-shapelet patterns, we propose to convert a time-variant graph sequence into time-series data and use the discovered shapelets to find graph transformation subsequences as graph-shapelet patterns. By converting each graph-shapelet pattern into a unique tokenized graph transformation sequence, we can measure the similarity between two graph-shapelet patterns and therefore classify time-variant graphs. Experiments on both synthetic and real-world data demonstrate the superior performance of the proposed algorithms.
Wang, L, Bao, X, Chen, H & Cao, L 2018, 'Effective lossless condensed representation and discovery of spatial co-location patterns', Information Sciences, vol. 436-437, pp. 197-213.
View/Download from: Publisher's site
View description>>
© 2018 Elsevier Inc. A spatial co-location pattern is a set of spatial features frequently co-occurring in nearby geographic spaces. Similar to closed frequent itemset mining, closed co-location pattern (CCP) mining was proposed for losslessly condensing large collections of prevalent co-location patterns. However, the state-of-the-art condensation methods in CCP mining are inspired by closed frequent itemset mining and do not consider the intrinsic characteristics of spatial co-locations, e.g., the participation index and ratio in spatial feature interactions, thus causing serious containment issues in CCP mining. In this paper, we propose a novel lossless condensed representation of prevalent co-location patterns, the Super Participation Index-closed (SPI-closed) co-location. An efficient SPI-closed Miner is also proposed to effectively capture the nature of spatial co-location patterns, alongside the development of three additional pruning strategies to make the SPI-closed Miner efficient. This method captures richer feature interactions in spatial co-locations and solves the containment issues in existing CCP methods. A performance evaluation conducted on both synthetic and real-life data sets shows that the SPI-closed Miner reduces the number of CCPs by up to 50%, and runs much faster than the baseline CCP mining algorithm described in the literature.
Wang, W, Laengle, S, Merigó, JM, Yu, D, Herrera-Viedma, E, Cobo, MJ & Bouchon-Meunier, B 2018, 'A Bibliometric Analysis of the First Twenty-Five Years of the International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems', International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 26, no. 02, pp. 169-193.
View/Download from: Publisher's site
View description>>
Since the International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems published its first issue in 1993, it has made important contributions to the research field of computer science. In this study, based on the dataset of the publications published in this journal between 1993 and 2016 retrieved from Web of Science, a general overview of this journal is performed using bibliometric methods and visualized networks. First, the productive and influential publications, authors, institutions, countries/territories, and supraregions are analysed based on the total number of citations, publications, and different citation thresholds. Second, network visualization analysis is applied to illustrate the links and connections between terms by using the VOSviewer software. Moreover, the most cited journals and common author keywords of three continents, including North America, Europe, and Asia, are also presented. This paper will hopefully help researchers understand the research patterns of this journal.
Wang, Y, Zhang, J, Liu, Z, Wu, Q, Zhang, Z & Jia, Y 2018, 'Depth Super-Resolution on RGB-D Video Sequences With Large Displacement 3D Motion', IEEE Transactions on Image Processing, vol. 27, no. 7, pp. 3571-3585.
View/Download from: Publisher's site
View description>>
© 1992-2012 IEEE. To enhance the resolution and accuracy of depth data, some video-based depth super-resolution methods have been proposed which utilize neighboring depth images in the temporal domain. They often consist of two main stages: motion compensation of temporally neighboring depth images and fusion of the compensated depth images. However, large displacement 3D motion often leads to compensation error, and this error is further introduced into the fusion. A video-based depth super-resolution method with novel motion compensation and fusion approaches is proposed in this paper. We claim that the 3D nearest neighbor field (NNF) is a better choice than positions with true motion displacement for depth enhancement. To handle large displacement 3D motion, the compensation stage utilizes the 3D NNF instead of the true motion used in previous methods. Next, the fusion approach is modeled as a regression problem to predict the super-resolution result efficiently for each depth image by using its compensated depth images. A new deep convolutional neural network architecture is designed for fusion, which is able to employ a large amount of video data for learning the complicated regression function. We comprehensively evaluate our method on various RGB-D video sequences to show its superior performance.
Wang, Z & Piccardi, M 2018, 'Minimum-risk temporal alignment of videos', Multimedia Tools and Applications, vol. 77, no. 12, pp. 14891-14906.
View/Download from: Publisher's site
View description>>
Temporal alignment of videos is an important requirement of tasks such as video comparison, analysis and classification. Most of the approaches proposed to date for video alignment leverage dynamic programming algorithms whose parameters are manually tuned. Conversely, this paper proposes a model that can learn its parameters automatically by minimizing a meaningful loss function over a given training set of videos and alignments. For learning, we exploit the effective framework of structural SVM and we extend it with an original scoring function that suitably scores the alignment of two given videos, and a loss function that quantifies the accuracy of a predicted alignment. The experimental results from four video action datasets show that the proposed model has been able to outperform a baseline and a state-of-the-art algorithm by a large margin in terms of alignment accuracy.
Wright, ST, Ryan, LM & Pham, T 2018, 'A novel case‐control subsampling approach for rapid model exploration of large clustered binary data', Statistics in Medicine, vol. 37, no. 6, pp. 899-913.
View/Download from: Publisher's site
View description>>
In many settings, an analysis goal is the identification of a factor, or set of factors associated with an event or outcome. Often, these associations are then used for inference and prediction. Unfortunately, in the big data era, the model building and exploration phases of analysis can be time‐consuming, especially if constrained by computing power (ie, a typical corporate workstation). To speed up this model development, we propose a novel subsampling scheme to enable rapid model exploration of clustered binary data using flexible yet complex model set‐ups (GLMMs with additive smoothing splines). By reframing the binary response prospective cohort study into a case‐control–type design, and using our knowledge of sampling fractions, we show one can approximate the model estimates as would be calculated from a full cohort analysis. This idea is extended to derive cluster‐specific sampling fractions and thereby incorporate cluster variation into an analysis. Importantly, we demonstrate that previously computationally prohibitive analyses can be conducted in a timely manner on a typical workstation. The approach is applied to analysing risk factors associated with adverse reactions relating to blood donation.
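A minimal sketch of the core computational idea, under the assumption of the standard case-control intercept shift of log(π1/π0), is given below: fit a logistic regression on a subsample that keeps all events and a known fraction of non-events, then correct the intercept using the sampling fractions. The clustered GLMM-with-splines machinery of the paper is not reproduced here; the data and fractions are placeholders.

```python
# Sketch: fit a logistic regression on a case-control style subsample (all
# events, a known fraction of non-events) and correct the intercept with the
# sampling fractions. This illustrates only the intercept-shift idea, not the
# paper's full GLMM-with-splines workflow.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(-4.0 + 1.2 * x)))      # rare outcome, true intercept -4, slope 1.2
y = rng.binomial(1, p)

pi_case, pi_ctrl = 1.0, 0.02                      # keep all events, 2% of non-events
keep = (y == 1) | ((y == 0) & (rng.random(n) < pi_ctrl))
xs, ys = x[keep], y[keep]

# Nearly unregularized fit on the much smaller subsample.
model = LogisticRegression(C=1e6).fit(xs.reshape(-1, 1), ys)

# Standard case-control correction: subtract log(pi_case / pi_ctrl) from the intercept.
intercept_corrected = model.intercept_[0] - np.log(pi_case / pi_ctrl)
print("slope ~", model.coef_[0][0], "corrected intercept ~", intercept_corrected)
```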
Wu, J, Pan, S, Zhou, C, Li, G, He, W & Zhang, C 2018, 'Advances in Processing, Mining, and Learning Complex Data: From Foundations to Real-World Applications', Complexity, vol. 2018, pp. 1-3.
View/Download from: Publisher's site
Wu, J, Pan, S, Zhu, X, Zhang, C & Wu, X 2018, 'Multi-Instance Learning with Discriminative Bag Mapping', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 6, pp. 1065-1080.
View/Download from: Publisher's site
View description>>
© 1989-2012 IEEE. Multi-instance learning (MIL) is a useful tool for tackling labeling ambiguity in learning because it allows a bag of instances to share one label. Bag mapping transforms a bag into a single instance in a new space via instance selection and has drawn significant attention recently. To date, most existing work is based on the original space, using all instances inside each bag for bag mapping, and the selected instances are not directly tied to an MIL objective. As a result, it is difficult to guarantee the distinguishing capacity of the selected instances in the new bag mapping space. In this paper, we propose a discriminative mapping approach for multi-instance learning (MILDM) that aims to identify the best instances to directly distinguish bags in the new mapping space. Accordingly, each instance bag can be mapped using the selected instances to a new feature space, and hence any generic learning algorithm, such as an instance-based learning algorithm, can be used to derive learning models for multi-instance classification. Experiments and comparisons on eight different types of real-world learning tasks (including 14 data sets) demonstrate that MILDM outperforms the state-of-the-art bag mapping multi-instance learning approaches. Results also confirm that MILDM achieves balanced performance between runtime efficiency and classification effectiveness.
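The bag-mapping step itself can be sketched generically: each bag is represented by its minimum distances to a small set of selected instances, after which any standard classifier applies. In the sketch below the instances are selected at random purely for illustration, whereas MILDM selects them discriminatively, so this is a structural illustration only, on synthetic bags.

```python
# Structural sketch of bag mapping for multi-instance learning: each bag is
# mapped to a vector of minimum distances to a set of "prototype" instances,
# so a standard classifier (here an SVM) can be trained on bags. MILDM selects
# the prototypes discriminatively; here they are sampled at random.
import numpy as np
from sklearn.svm import SVC
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)

def make_bag(label):
    inst = rng.normal(size=(rng.integers(5, 15), 2))
    if label == 1:
        inst[0] += 3.0                     # positive bags contain one shifted instance
    return inst

labels = rng.integers(0, 2, size=120)
bags = [make_bag(l) for l in labels]

all_instances = np.vstack(bags)
prototypes = all_instances[rng.choice(len(all_instances), size=10, replace=False)]

def map_bag(bag):
    # minimum distance from any instance in the bag to each prototype
    return cdist(bag, prototypes).min(axis=0)

X = np.array([map_bag(b) for b in bags])   # one fixed-length vector per bag
clf = SVC().fit(X[:80], labels[:80])
print("holdout accuracy:", clf.score(X[80:], labels[80:]))
```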
Wu, W, Li, B, Chen, L, Zhu, X & Zhang, C 2018, '$K$-Ary Tree Hashing for Fast Graph Classification', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 5, pp. 936-949.
View/Download from: Publisher's site
View description>>
IEEE Existing graph classification usually relies on an exhaustive enumeration of substructure patterns, where the number of substructures expands exponentially w.r.t. the size of the graph set. Recently, the Weisfeiler-Lehman (WL) graph kernel has achieved the best performance in terms of both accuracy and efficiency among state-of-the-art methods. However, it is still time-consuming, especially for large-scale graph classification tasks. In this paper, we present a $k$-Ary Tree based Hashing (KATH) algorithm, which is able to obtain competitive accuracy with a very fast runtime. The main idea of KATH is to construct a traversal table to quickly approximate the subtree patterns in WL using $k$-ary trees. Based on the traversal table, KATH employs a recursive indexing process that performs only $r$ rounds of matrix indexing to generate all $(r-1)$-depth $k$-ary trees, where the leaf node labels of a tree can uniquely specify the pattern. After that, the MinHash scheme is used to fingerprint the acquired subtree patterns for a graph. Our experimental results on both real-world and synthetic datasets show that KATH runs significantly faster than state-of-the-art methods while achieving competitive or better accuracy.
Wu, Z, Li, G, Liu, Q, Xu, G & Chen, E 2018, 'Covering the Sensitive Subjects to Protect Personal Privacy in Personalized Recommendation', IEEE Transactions on Services Computing, vol. 11, no. 3, pp. 493-506.
View/Download from: Publisher's site
Wu, Z, Xu, G, Lu, C, Chen, E, Jiang, F & Li, G 2018, 'An effective approach for the protection of privacy text data in the CloudDB', World Wide Web, vol. 21, no. 4, pp. 915-938.
View/Download from: Publisher's site
View description>>
© 2017 Springer Science+Business Media, LLC Due to the advantages of pay-on-demand, expand-on-demand and high availability, cloud databases (CloudDB) have been widely used in information systems. However, since a CloudDB is distributed on an untrusted cloud side, how to effectively protect the massive amount of private information in the CloudDB is an important problem. Although traditional security strategies (such as identity authentication and access control) can prevent illegal users from accessing unauthorized data, they cannot prevent internal users at the cloud side from accessing and exposing personal privacy information. In this paper, we propose a client-based approach to protect personal privacy in a CloudDB. In this approach, privacy data are encrypted using a traditional encryption algorithm before being stored on the cloud side, so as to ensure their security. To execute various kinds of query operations over the encrypted data efficiently, the encrypted data are also augmented with an additional feature index, so that as much of each query operation as possible can be processed on the cloud side without the need to decrypt the data. To this end, we explore how the feature index of privacy data is constructed, and how a query operation over privacy data is transformed into a new query operation over the index data so that it can be executed on the cloud side correctly. The effectiveness of the approach is demonstrated by theoretical analysis and experimental evaluation. The results show that the approach has good performance in terms of security, usability and efficiency, and is thus effective in protecting personal privacy in the CloudDB.
Xiao, L, Zhang, Y, Zhang, J, Wang, Q & Li, Y 2018, 'Combining HWEBING and HOG‐MLBP features for pedestrian detection', The Journal of Engineering, vol. 2018, no. 16, pp. 1421-1426.
View/Download from: Publisher's site
Xiong, F, Wang, X, Pan, S, Yang, H, Wang, H & Zhang, C 2018, 'Social Recommendation With Evolutionary Opinion Dynamics', IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 50, no. 10, pp. 1-13.
View/Download from: Publisher's site
View description>>
IEEE When users in online social networks make a decision, they are often affected by their neighbors. Social recommendation models utilize social information to reveal the impact of neighbors on user preferences, and this impact is often described by the linear superposition of neighbor preferences or by global trust propagation. Further exploration is needed to determine whether the influence pattern of other users, derived from online interaction behaviors, is adequately described. In this paper, we introduce evolutionary opinion dynamics from the field of statistical physics into recommender systems to characterize the impact of other users. We propose an opinion dynamics model based on evolutionary game theory. To describe online user interactions, we define the strategies used during an interaction between two users, and present the payoff for each strategy in terms of the errors of estimated ratings. Therefore, user behaviors are associated with their preferences and ratings. In addition, we measure user influence according to their topological roles in the social network. We incorporate evolutionary opinion dynamics and user influence into the recommendation framework for the prediction of unknown ratings. Experiment results on two real-world datasets demonstrate that our method outperforms state-of-the-art models in terms of accuracy, and it also performs well for cold-start users. Our method reduces the divergence of user preferences, in accordance with online opinion interactions. Furthermore, our method has computational complexity comparable to matrix factorization, and requires less computation than state-of-the-art models. Our method is quite general, and indicates that studies in social physics, statistics, and other research fields may be incorporated into recommendation to improve performance.
Xuan, J, Lu, J, Zhang, G, Xu, RYD & Luo, X 2018, 'Doubly Nonparametric Sparse Nonnegative Matrix Factorization Based on Dependent Indian Buffet Processes', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 5, pp. 1835-1849.
View/Download from: Publisher's site
View description>>
© 2012 IEEE. Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which could benefit many tasks, such as document-word co-clustering. However, the traditional SNMF typically assumes the number of latent factors (i.e., the dimensionality of the factor matrices) to be fixed. This assumption makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework to mitigate this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function for the generation of two stick weights associated with each column pair of factor matrices while still maintaining their respective marginal distributions as specified by the IBP. As a consequence, the generation of the two factor matrices will be columnwise correlated. Under this framework, two classes of correlation function are proposed: 1) using the bivariate Beta distribution and 2) using the Copula function. Compared with single IBP-based NMF, the proposed framework jointly makes the two factor matrices nonparametric and sparse, and so could be applied to broader scenarios, such as co-clustering. The proposed framework is also much more flexible than Gaussian process-based and hierarchical Beta process-based dIBPs in terms of allowing the two corresponding binary matrix columns to have greater variations in their nonzero entries. Our experiments on synthetic data show the merits of the proposed framework compared with state-of-the-art models with respect to factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate its efficiency in document-word co-clustering tasks.
Xue, R, Huang, S, Luo, X, Jiang, D & Da Xu, RY 2018, 'Semantic emotion-topic model based social emotion mining', Journal of Web Engineering, vol. 17, no. 1-2, pp. 73-92.
View description>>
With the boom in social media users, more and more short texts with emotion labels appear, containing users' rich emotions and opinions about social events or enterprise products. Social emotion mining on social media corpora can help governments or enterprises make decisions. Emotion mining models include statistics-based and graph-based approaches. Among them, the former approaches are more popular, e.g. the Latent Dirichlet Allocation (LDA)-based Emotion Topic Model. However, they suffer from low retrieval performance, such as poor accuracy and poor interpretability, because they only consider the bag-of-words representation or the emotion labels in the social media corpus. In this paper, we propose an LDA-based Semantic Emotion-Topic Model (SETM) combining emotion labels and inter-word relations to enhance the retrieval performance of social emotion mining results. The influence of four factors on the performance of SETM is considered, i.e., association relations, computing time, topic number and semantic interpretability. Experimental results show that the accuracy of our proposed model is 0.750, compared with 0.606, 0.663 and 0.680 for the Emotion Topic Model (ETM), the Multi-label Supervised Topic Model (MSTM) and the Sentiment Latent Topic Model (SLTM), respectively. Moreover, the computing time of our model is reduced by 87.81% through limiting word frequency, and its accuracy is then 0.703, compared with 0.501, 0.648 and 0.642 for the above baseline methods. Thus, the proposed model has broad prospects in the social emotion mining area.
Yang, E, Deng, C, Li, C, Liu, W, Li, J & Tao, D 2018, 'Shared Predictive Cross-Modal Deep Quantization', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 11, pp. 5292-5303.
View/Download from: Publisher's site
View description>>
© 2012 IEEE. With explosive growth of data volume and ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have proven that quantization-based approaches perform generally better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts of leveraging deep neural networks into quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for individual modalities, and representations in the shared subspace and the private subspaces are learned simultaneously by embedding them to a reproducing kernel Hilbert space, where the mean embedding of different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce the semantics preserving compact codes with the help of label alignment. Thanks to this novel network architecture in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
Yang, W, Li, J, Zheng, H & Xu, RYD 2018, 'A Nuclear Norm Based Matrix Regression Based Projections Method for Feature Extraction', IEEE Access, vol. 6, pp. 7445-7451.
View/Download from: Publisher's site
View description>>
© 2013 IEEE. In the traditional graph embedding framework, the graph is usually built by k-NN or r-ball. Since it is difficult to manually set the parameters k and r in the high-dimensional space, sparse representation-based methods are usually introduced to automatically build the graphs. In recent years, nuclear norm-based matrix regression (NMR) has been proposed for face recognition using the low rank structural information (i.e., the image matrix-based error model). Inspired by NMR, we give a NMR-based projections (NMRP) method for feature extraction and recognition. The experiments on FERET and extended Yale B face databases show that NMR can be used to build the graph while NMRP is an effective feature extraction method.
Yusoff, B, Merigó, JM, Ceballos, D & Peláez, JI 2018, 'Weighted-selective aggregated majority-OWA operator and its application in linguistic group decision making', International Journal of Intelligent Systems, vol. 33, no. 9, pp. 1929-1948.
View/Download from: Publisher's site
View description>>
© 2018 Wiley Periodicals, Inc. This paper focuses on the aggregation operations in the group decision-making model based on the concept of majority opinion. The weighted-selective aggregated majority-OWA (WSAM-OWA) operator is proposed as an extension of the SAM-OWA operator, where the reliability of information sources is considered in the formulation. The WSAM-OWA operator is generalized to the quantified WSAM-OWA operator by including the concept of linguistic quantifier, mainly for the group fusion strategy. The QWSAM-IOWA operator, with an ordering step, is introduced to the individual fusion strategy. The proposed aggregation operators are then implemented for the case of alternative scheme of heterogeneous group decision analysis. The heterogeneous group includes the consensus of experts with respect to each specific criterion. The exhaustive multicriteria group decision-making model under the linguistic domain, which consists of two-stage aggregation processes, is developed in order to fuse the experts’ judgments and to aggregate the criteria. The model provides greater flexibility when analyzing the decision alternatives with a tolerance that considers the majority of experts and the attitudinal character of experts. A selection of investment problem is given to demonstrate the applicability of the developed model.
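For readers unfamiliar with the OWA family underlying these operators, the basic OWA step (reorder the arguments in descending order and take a weighted sum) is sketched below; the majority-based, weighted-selective and quantifier-guided extensions introduced in the paper build on this operation and are not reproduced here. The scores and weights are placeholders.

```python
# Basic OWA aggregation: sort the arguments in descending order and take the
# dot product with a weighting vector that sums to one. The WSAM-OWA and
# QWSAM-IOWA operators in the paper are extensions of this basic step.
import numpy as np

def owa(values, weights):
    values = np.sort(np.asarray(values, dtype=float))[::-1]   # descending order
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0), "weights must sum to 1"
    return float(values @ weights)

scores = [0.7, 0.2, 0.9, 0.5]                 # e.g. experts' evaluations of one alternative
print(owa(scores, [0.4, 0.3, 0.2, 0.1]))      # optimism-leaning weighting
print(owa(scores, [0.25, 0.25, 0.25, 0.25]))  # plain average as a special case
```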
Zhang, B, Li, J & Lü, Q 2018, 'Prediction of 8-state protein secondary structures by a novel deep learning architecture', BMC Bioinformatics, vol. 19, no. 1.
View/Download from: Publisher's site
View description>>
© 2018 The Author(s). Background: Protein secondary structure can be regarded as an information bridge that links the primary sequence and tertiary structure. Accurate 8-state secondary structure prediction can provide significantly more precise and higher-resolution structure-based property analysis. Results: We present a novel deep learning architecture which exploits an integrative synergy of prediction by a convolutional neural network, a residual network, and a bidirectional recurrent neural network to improve the performance of protein secondary structure prediction. A local block comprised of convolutional filters and the original input is designed for capturing local sequence features. The subsequent bidirectional recurrent neural network consisting of gated recurrent units can capture global context features. Furthermore, the residual network can improve the information flow between the hidden layers and the cascaded recurrent neural network. Our proposed deep network achieved 71.4% accuracy on the benchmark CB513 dataset for 8-state prediction, and ensemble learning with our model achieved 74% accuracy. The generalization capability of our model is also evaluated on three other independent datasets, CASP10, CASP11 and CASP12, for both 8- and 3-state prediction. These prediction performances are superior to those of state-of-the-art methods. Conclusion: Our experiments demonstrate that this is a valuable method for predicting protein secondary structure, and that capturing local and global features concurrently is very useful in deep learning.
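A highly simplified Keras sketch of this kind of hybrid architecture (a convolutional local block with a residual-style addition of a projected input, a bidirectional GRU for global context, and an 8-state per-residue softmax) is given below; the layer sizes, fixed sequence length and 21-dimensional input encoding are placeholders rather than the authors' configuration.

```python
# Simplified sketch of a CNN + residual + bidirectional GRU architecture for
# per-residue 8-state secondary structure prediction. Layer sizes, sequence
# length and the 21-dimensional input encoding are placeholders, not the
# paper's settings.
import tensorflow as tf
from tensorflow.keras import layers

seq_len, n_in, n_states = 700, 21, 8

inputs = layers.Input(shape=(seq_len, n_in))

# Local block: convolutional filters plus a projected copy of the input (residual-style).
conv = layers.Conv1D(64, 3, padding="same", activation="relu")(inputs)
proj = layers.Conv1D(64, 1, padding="same")(inputs)      # match channel count for the add
local = layers.Add()([conv, proj])

# Global context: bidirectional GRU over the whole sequence.
context = layers.Bidirectional(layers.GRU(64, return_sequences=True))(local)

# Per-residue 8-state prediction.
outputs = layers.Dense(n_states, activation="softmax")(context)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```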
Zhang, G, Piccardi, M & Borzeshi, EZ 2018, 'Sequential Labeling With Structural SVM Under Nondecomposable Losses', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 9, pp. 4177-4188.
View/Download from: Publisher's site
View description>>
IEEE Sequential labeling addresses the classification of sequential data, which are widespread in fields as diverse as computer vision, finance, and genomics. The model traditionally used for sequential labeling is the hidden Markov model (HMM), where the sequence of class labels to be predicted is encoded as a Markov chain. In recent years, HMMs have benefited from minimum-loss training approaches, such as the structural support vector machine (SSVM), which, in many cases, has reported higher classification accuracy. However, the loss functions available for training are restricted to decomposable cases, such as the 0-1 loss and the Hamming loss. In many practical cases, other loss functions, such as those based on the F₁ measure, the precision/recall break-even point, and the average precision (AP), can describe desirable performance more effectively. For this reason, in this paper, we propose a training algorithm for SSVM that can minimize any loss based on the classification contingency table, and we present a training algorithm that minimizes an AP loss. Experimental results over a set of diverse and challenging data sets (TUM Kitchen, CMU Multimodal Activity, and Ozone Level Detection) show that the proposed training algorithms achieve significant improvements of the F₁ measure and AP compared with the conventional SSVM, and their performance is in line with or above that of other state-of-the-art sequential labeling approaches.
Zhang, H, Xu, G, Liang, X, Xu, G, Li, F, Fu, K, Wang, L & Huang, T 2018, 'An Attention-Based Word-Level Interaction Model for Knowledge Base Relation Detection', IEEE Access, vol. 6, pp. 75429-75441.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Relation detection plays a crucial role in knowledge base question answering, and it is challenging because of the high variance of relation expression in real-world questions. Traditional relation detection models based on deep learning follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. The max- or average-pooling operation, which is used to compress the sequence of words into fixed-dimensional vectors, becomes the bottleneck of information flow. In this paper, we propose an attention-based word-level interaction model (ABWIM) to alleviate the information loss issue caused by aggregating the sequence into a fixed-dimensional vector before the comparison. First, an attention mechanism is adopted to learn the soft alignments between words from the question and the relation. Then, fine-grained comparisons are performed on the aligned words. Finally, the comparison results are merged with a simple recurrent layer to estimate the semantic similarity. Besides, a dynamic sample selection strategy is proposed to accelerate the training procedure without decreasing the performance. Experimental results of relation detection on both the SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
Zhang, J, McBurney, P & Musial, K 2018, 'Convergence of trading strategies in continuous double auction markets with boundedly-rational networked traders', Review of Quantitative Finance and Accounting, vol. 50, no. 1, pp. 301-352.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media New York. This paper considers the convergence of trading strategies among artificial traders connected to one another in a social network and trading in a continuous double auction financial marketplace. Convergence is studied by means of an agent-based simulation model called the Social Network Artificial stoCk marKet model. Six different canonical network topologies (including no-network) are used to represent the possible connections between artificial traders. Traders learn from the trading experiences of their connected neighbours by means of reinforcement learning. The results show that the proportions of traders using particular trading strategies are eventually stable. Which strategies dominate in these stable states depends to some extent on the particular network topology of trader connections and the types of traders.
Zhang, Q, Wu, J, Zhang, Q, Zhang, P, Long, G & Zhang, C 2018, 'Dual influence embedded social recommendation', World Wide Web, vol. 21, no. 4, pp. 849-874.
View/Download from: Publisher's site
View description>>
© 2017, Springer Science+Business Media, LLC. Recommender systems are designed to solve the information overload problem and have been widely studied for many years. Conventional recommender systems tend to take users' ratings on products into account. With the development of Web 2.0, rating networks in many online communities (e.g. Netflix and Douban) allow users not only to co-comment on or co-rate their interests (e.g. movies and books), but also to build explicit social networks. Recent recommendation models use various social data, such as observable links, but the approaches that incorporate this explicit social information normally adopt similarity measures (e.g. cosine similarity) to evaluate the explicit relationships in the network; they do not consider the latent and implicit relationships in the network, such as social influence. A target user's purchase behavior or interest, for instance, is not always determined by their directly connected relationships and may be significantly influenced by the high reputation of people they do not know in the network, or by others who have expertise in specific domains (e.g. famous social communities). In this paper, based on the above observations, we first simulate the social influence diffusion in the network to find the global and local influence nodes, and then embed this dual influence data into a traditional recommendation model to improve accuracy. Mathematically, we formulate the global and local influence data as new dual social influence regularization terms and embed them into a matrix factorization-based recommendation model. Experiments on real-world datasets demonstrate the effective performance of the proposed method.
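The general pattern of embedding social-influence regularization into matrix factorization can be sketched in a few lines of numpy: alongside the usual rating-reconstruction updates, each user's latent vector is pulled toward an influence-weighted combination of other users' vectors. The influence matrix below is a random placeholder and the update rule is a heuristic gradient-style sketch, whereas the paper derives global and local influence from a diffusion simulation over the social network.

```python
# Sketch of matrix factorization with a social-influence regularization term:
# alongside the rating-reconstruction updates, each user's latent vector is
# pulled toward an influence-weighted combination of other users' vectors.
# The influence matrix S and all hyperparameters are placeholders.
import numpy as np

rng = np.random.default_rng(4)
n_users, n_items, k = 50, 80, 8
R = rng.integers(0, 6, size=(n_users, n_items)).astype(float)  # 0 = unobserved
mask = R > 0

S = rng.random((n_users, n_users)) * (rng.random((n_users, n_users)) < 0.1)
S = S / np.maximum(S.sum(axis=1, keepdims=True), 1e-12)        # row-normalized influence

U = 0.1 * rng.normal(size=(n_users, k))
V = 0.1 * rng.normal(size=(n_items, k))
lr, lam, beta = 0.01, 0.05, 0.1                                 # beta weighs the social term

for _ in range(200):
    E = mask * (R - U @ V.T)                 # errors on observed ratings only
    social_pull = S @ U - U                  # pull toward influential users' vectors
    U += lr * (E @ V - lam * U + beta * social_pull)
    V += lr * (E.T @ U - lam * V)

E = mask * (R - U @ V.T)
print("training RMSE:", np.sqrt((E[mask] ** 2).mean()))
```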
Zhang, Z, Chen, J, Wu, Q & Shao, L 2018, 'GII Representation-Based Cross-View Gait Recognition by Discriminative Projection With List-Wise Constraints', IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 2935-2947.
View/Download from: Publisher's site
View description>>
© 2017 IEEE. Remote person identification by gait is one of the most important topics in the field of computer vision and pattern recognition. However, gait recognition suffers severely from the appearance variance caused by view changes. It is very common for gait recognition to perform well when the view is fixed, but performance decreases sharply when the view variance becomes significant. Existing approaches have tried all kinds of strategies, such as tensor analysis or view transform models, to slow down the trend of performance decrease but still have potential for further improvement. In this paper, a discriminative projection with list-wise constraints (DPLC) is proposed to deal with view variance in cross-view gait recognition, which is further refined by introducing a rectification term to automatically capture the principal discriminative information. The DPLC with rectification (DPLCR) embeds list-wise relative similarity measurement among intra-class and inter-class individuals, which can learn a more discriminative and robust projection. Based on the original DPLCR, we have introduced the kernel trick to exploit nonlinear cross-view correlations and extended DPLCR to deal with the problem of multiview gait recognition. Moreover, a simple yet efficient gait representation, namely the gait individuality image (GII), based on the gait energy image is proposed, which can better capture the discriminative information for cross-view gait recognition. Experiments have been conducted on the CASIA-B database and the experimental results demonstrate the outstanding performance of both the DPLCR framework and the new GII representation. It is shown that DPLCR-based cross-view gait recognition has outperformed the state-of-the-art approaches in almost all cases under large view variance. The combination of the GII representation and the DPLCR further enhances the performance, setting a new benchmark for cross-view gait recognition.
Zhao, J, Mao, X & Zhang, J 2018, 'Learning deep facial expression features from image and optical flow sequences using 3D CNN', The Visual Computer, vol. 34, no. 10, pp. 1461-1475.
View/Download from: Publisher's site
View description>>
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature. Facial expression is highly correlated with the facial motion. According to whether the temporal information of facial motion is used or not, the facial expression features can be classified as static and dynamic features. The former, which mainly includes the geometric features and appearance features, can be extracted by convolution or other learning filters; the latter, which are aimed to model the dynamic properties of facial motion, can be calculated through optical flow or other methods, respectively. When 3D convolutional neural networks (CNNs) are introduced, the extraction of two different types of features mentioned above becomes easy. In this paper, one 3D CNN architecture is presented to learn the static and dynamic features from facial image sequences and extract high-level dynamic features from optical flow sequences. Two types of dense optical flow, which contain the tracking information of facial muscle movement, are calculated according to different image pair construction methods. One is the common optical flow, and the other is an enhanced optical flow which is called accumulative optical flow. Four components of each type of optical flow are used in experiments. Three databases, two acted databases and one nearly realistic database, are selected to conduct the experiments. The experiments on the two acted databases achieve state-of-the-art accuracy, and indicate that the vertical component of optical flow has an advantage over other components in recognizing facial expression. The experimental results on the three selected databases show that more discriminative features can be learned from image sequences than from optical flow or accumulative optical flow sequences, and the accumulative optical flow contains more motion information than optical flow if the frame distance of the image pairs used to calculate them is not too large.
Zhao, L, Wu, S, Jiang, J, Li, W, Luo, J & Li, J 2018, 'Novel overlapping subgraph clustering for the detection of antigen epitopes', Bioinformatics, vol. 34, no. 12, pp. 2061-2068.
View/Download from: Publisher's site
View description>>
Abstract Motivation Antigens that contain overlapping epitopes have been occasionally reported. As current algorithms mainly take a one-antigen-one-epitope approach to the prediction of epitopes, they are not capable of detecting these multiple and overlapping epitopes accurately, or even those multiple and separated epitopes existing in some other antigens. Results We introduce a novel subgraph clustering algorithm for more accurate detection of epitopes. This algorithm takes graph partitions as seeds, and expands the seeds to merge overlapping subgraphs based on term frequency-inverse document frequency (TF-IDF) featured similarity. Then, the merged subgraphs are each classified as an epitope or non-epitope. Tests of our algorithm were conducted on three newly collected datasets of antigens. In the first dataset, each antigen contains only a single epitope; in the second, each antigen contains only multiple and separated epitopes; and in the third, each antigen contains overlapping epitopes. The prediction performance of our algorithm is significantly better than that of the state-of-the-art methods. The lifts of the averaged f-scores on top of the best existing methods are 60, 75 and 22% for single epitope detection, multiple and separated epitope detection, and overlapping epitope detection, respectively. Availability and implementation The source code is available at github.com/lzhlab/glep/. Supplementary information Supplementary data are...
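For context on the similarity used above, the common TF-IDF weighting of a term \(t\) in a document \(d\) (the paper's exact variant is not restated in the abstract, so this is the textbook form) is
\[ \mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)}, \]
where \(N\) is the number of documents and \(\mathrm{df}(t)\) is the number of documents containing \(t\); the similarity between two subgraphs can then be taken, for example, as the cosine of their TF-IDF vectors.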
Zhao, L, Xie, J, Bai, L, Chen, W, Wang, M, Zhang, Z, Wang, Y, Zhao, Z & Li, J 2018, 'Mining statistically-solid k-mers for accurate NGS error correction', BMC Genomics, vol. 19, no. S10, pp. 912-912.
View/Download from: Publisher's site
View description>>
BACKGROUND:NGS data contains many machine-induced errors. The most advanced methods for error correction depend heavily on the selection of solid k-mers. A solid k-mer is a k-mer that occurs frequently in NGS reads. The other k-mers are called weak k-mers. A solid k-mer is unlikely to contain errors, while a weak k-mer most likely contains errors. An intensively investigated problem is to find a good frequency cutoff f0 to balance the numbers of solid and weak k-mers. Once the cutoff is determined, a more challenging but less-studied problem is to: (i) remove a small subset of solid k-mers that are likely to contain errors, and (ii) add a small subset of weak k-mers that are likely to contain no errors into the remaining set of solid k-mers. Identification of these two subsets of k-mers can improve correction performance. RESULTS:We propose to use a Gamma distribution to model the frequencies of erroneous k-mers and a mixture of Gaussian distributions to model correct k-mers, and combine them to determine f0. To identify the two special subsets of k-mers, we use the z-score of k-mers, which measures the number of standard deviations a k-mer's frequency is from the mean. Then these statistically-solid k-mers are used to construct a Bloom filter for error correction. Our method is markedly superior to the state-of-the-art methods, as tested on both real and synthetic NGS data sets. CONCLUSION:The z-score is adequate for distinguishing solid k-mers from weak k-mers, and is particularly useful for pinpointing solid k-mers of very low frequency. Applying the z-score to k-mers can markedly improve the error correction accuracy.
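As a reminder, the z-score used above measures how many standard deviations a k-mer's frequency lies from the mean; with \(f(k)\) the observed frequency of k-mer \(k\) and \(\mu, \sigma\) the mean and standard deviation of the fitted frequency model,
\[ z(k) = \frac{f(k) - \mu}{\sigma}. \]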
Zheng, Y, Peng, H, Zhang, X, Zhao, Z, Yin, J & Li, J 2018, 'Predicting adverse drug reactions of combined medication from heterogeneous pharmacologic databases', BMC Bioinformatics, vol. 19, no. S19, pp. 49-59.
View/Download from: Publisher's site
View description>>
BACKGROUND:Early and accurate identification of potential adverse drug reactions (ADRs) for combined medication is vital for public health. Existing methods either rely on expensive wet-lab experiments or detect existing associations from related records. Thus, they inevitably suffer from under-reporting, reporting delays, and an inability to detect ADRs for new and rare drugs. The current application of machine learning methods is severely impeded by the lack of proper drug representations and credible negative samples. Therefore, a method to represent drugs properly and to select credible negative samples becomes vital in applying machine learning methods to this problem. RESULTS:In this work, we propose a machine learning method to predict ADRs of combined medication from pharmacologic databases by building up highly-credible negative samples (HCNS-ADR). Specifically, we first fuse heterogeneous information from different databases and represent each drug as a multi-dimensional vector according to its chemical substructures, target proteins, substituents, and related pathways. Then, a drug-pair vector is obtained by appending the vector of one drug to the other. Next, we construct a drug-disease-gene network and devise a scoring method to measure the interaction probability of every drug pair via network analysis. Drug pairs with lower interaction probability are preferentially selected as negative samples. Following that, the validated positive samples and the selected credible negative samples are projected into a lower-dimensional space using principal component analysis. Finally, a classifier is built for each ADR using its positive and negative samples with reduced dimensions. The performance of the proposed method is evaluated on simulative prediction for 1276 ADRs and 1048 drugs, using four machine learning algorithms and comparing with two baseline approaches. Extensive experiments show that the proposed way to represent drugs characterizes drugs accu...
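A minimal sketch of the drug-pair representation and dimensionality-reduction steps described above (not the HCNS-ADR pipeline itself; feature values and sizes are synthetic placeholders, and scikit-learn's PCA stands in for the paper's projection step):

```python
# Illustrative only: build drug-pair vectors by appending one drug's feature
# vector to the other's, then project the pairs into a lower-dimensional space
# with PCA, as the abstract describes.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_drugs, n_features = 100, 500                       # hypothetical sizes
drug_vectors = rng.random((n_drugs, n_features))     # substructures, targets, pathways, ...

# Drug-pair vector: concatenation of the two drugs' feature vectors.
pairs = [(i, j) for i in range(n_drugs) for j in range(i + 1, n_drugs)]
pair_vectors = np.array([np.concatenate([drug_vectors[i], drug_vectors[j]])
                         for i, j in pairs])

# Project the pair vectors into a lower-dimensional space.
reduced = PCA(n_components=50).fit_transform(pair_vectors)
print(reduced.shape)                                 # (number of pairs, 50)
```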
Zhu, C, Cao, L, Liu, Q, Yin, J & Kumar, V 2018, 'Heterogeneous Metric Learning of Categorical Data with Hierarchical Couplings', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 7, pp. 1254-1267.
View/Download from: Publisher's site
Zhu, M, He, B & Wu, Q 2018, 'Single Image Dehazing Based on Dark Channel Prior and Energy Minimization', IEEE Signal Processing Letters, vol. 25, no. 2, pp. 174-178.
View/Download from: Publisher's site
View description>>
© 2017 IEEE. Hazy images have limited visibility and low contrast. The degradation is expressed by the transmission map, which is one of the most important estimates in single image dehazing. Transmission map estimation is an under-constrained problem, and many priors have been proposed. Among them, the dark channel prior is widely recognized. However, traditional methods have not fully exploited its power due to improper assumptions or operations, which cause unwanted artifacts. The post-refinement algorithms employed to remove these artifacts in turn undermine the merits of the prior. In this letter, a novel method for estimating the transmission map by energy minimization is proposed to solve this problem. The energy function combines the dark channel prior with piecewise smoothness. The method is compared to the state-of-the-art methods and shows outstanding performance.
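For reference, the standard haze imaging model and the dark channel prior transmission estimate that such methods build on (the letter's own energy function combining this prior with piecewise smoothness is not restated in the abstract):
\[ I(x) = J(x)\,t(x) + A\bigl(1 - t(x)\bigr), \qquad \tilde{t}(x) = 1 - \omega \min_{y \in \Omega(x)} \min_{c \in \{r,g,b\}} \frac{I^{c}(y)}{A^{c}}, \]
where \(I\) is the hazy image, \(J\) the scene radiance, \(A\) the atmospheric light, \(t\) the transmission map, and \(\Omega(x)\) a local patch around pixel \(x\).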
Zuo, Y, Wu, Q, An, P & Shang, X 2018, 'Integrated cosparse analysis model with explicit edge inconsistency measurement for guided depth map upsampling', Journal of Electronic Imaging, vol. 27, no. 04, pp. 1-1.
View/Download from: Publisher's site
View description>>
© 2018 SPIE and IS & T. A low-resolution depth map can be upsampled through the guidance from the registered high-resolution color image. This type of method is known as guided depth map upsampling. Among the existing methods based on Markov random fields (MRF), either a data-driven or a model-based prior is adopted to construct the regularization term. The data-driven prior can implicitly reveal the relation between the color-depth image pair by training on external data. The model-based prior provides the anisotropic smoothness constraint guided by the high-resolution color image. These types of priors can complement each other to resolve the ambiguity in guided depth map upsampling. An MRF-based approach is proposed that takes both of them into account to regularize the depth map. Based on analysis sparse coding, the data-driven prior is defined by joint cosparsity on the vectors transformed from color-depth patches using the pair of learned operators. It is based on the assumption that the cosupports of such bimodal image structures computed by the operators are aligned. The edge inconsistency measurement is explicitly calculated and embedded into the model-based prior, which significantly mitigates texture-copying artifacts. The experimental results on the Middlebury datasets demonstrate the validity of the proposed method, which outperforms seven state-of-the-art approaches.
Zuo, Y, Wu, Q, Zhang, J & An, P 2018, 'Explicit Edge Inconsistency Evaluation Model for Color-Guided Depth Map Enhancement', IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 2, pp. 439-453.
View/Download from: Publisher's site
View description>>
© 2016 IEEE. Color-guided depth enhancement is used to refine depth maps under the assumption that the depth edges and the color edges at corresponding locations are consistent. In methods for such low-level vision tasks, the Markov random field (MRF), including its variants, is one of the major approaches and has dominated this area for several years. However, the assumption above is not always true. To tackle the problem, the state-of-the-art solutions adjust the weighting coefficient inside the smoothness term of the MRF model. These methods lack an explicit evaluation model to quantitatively measure the inconsistency between the depth edge map and the color edge map, so they cannot adaptively control the strength of the guidance from the color image for depth enhancement, leading to defects such as texture-copying artifacts and blurred depth edges. In this paper, we propose a quantitative measurement of such inconsistency and explicitly embed it into the smoothness term. The proposed method demonstrates promising experimental results compared with the benchmark and state-of-the-art methods on the Middlebury, ToF-Mark, and NYU data sets.
Zuo, Y, Wu, Q, Zhang, J & An, P 2018, 'Minimum Spanning Forest With Embedded Edge Inconsistency Measurement Model for Guided Depth Map Enhancement', IEEE Transactions on Image Processing, vol. 27, no. 8, pp. 4145-4159.
View/Download from: Publisher's site
View description>>
© 1992-2012 IEEE. Guided depth map enhancement based on the Markov random field (MRF) normally assumes edge consistency between the color image and the corresponding depth map. Under this assumption, low-quality depth edges can be refined according to the guidance from the high-quality color image. However, such consistency is not always true, which leads to texture-copying artifacts and blurred depth edges. In addition, previous MRF-based models always calculate the guidance affinities in the regularization term via a non-structural scheme, which ignores the local structure of the depth map. In this paper, a novel MRF-based method is proposed. It computes these affinities via the distance between pixels in a space consisting of minimum spanning trees (a forest) to better preserve depth edges. Furthermore, inside each minimum spanning tree, the weights of edges are computed based on the explicit edge inconsistency measurement model, which significantly mitigates texture-copying artifacts. To further tolerate the effects caused by noise and better preserve depth edges, a bandwidth adaptation scheme is proposed. Our method is evaluated on depth map super-resolution and depth map completion problems on synthetic and real data sets, including Middlebury, ToF-Mark, and NYU. A comprehensive comparison against 16 state-of-the-art methods is carried out. Both qualitative and quantitative evaluations show the improved performance.
Abdollahi, M, Gao, X, Mei, Y, Ghosh, S & Li, J 1970, 'Uncovering Discriminative Knowledge-Guided Medical Concepts for Classifying Coronary Artery Disease Notes', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Australasian Joint Conference on Artificial Intelligence, Springer International Publishing, Wellington, New Zealand, pp. 104-110.
View/Download from: Publisher's site
View description>>
© Springer Nature Switzerland AG 2018. Text classification is the challenging task of allocating each document to the correct predefined class. Most of the time, there are irrelevant features which introduce noise into the learning step and reduce prediction precision. Hence, more efficient methods are needed to select or extract meaningful features and avoid noise and overfitting. In this work, an ontology-guided method utilizing the taxonomical structure of the Unified Medical Language System (UMLS) is proposed. This method extracts, as features, the concepts of phrases appearing in the documents that relate to diseases or symptoms. The efficiency of this method is evaluated on the 2010 Informatics for Integrating Biology and the Bedside (i2b2) data set. The experimental results show a significant improvement in classification accuracy by the proposed ontology-based method.
Abdulkareem, SA, Augustijn, EW, Musial, K, Mustafa, YT & Filatova, T 1970, 'The impact of social versus individual learning for agents' risk perception during epidemics', Proceedings - IEEE 14th International Conference on eScience, e-Science 2018, 2018 IEEE 14th International Conference on e-Science (e-Science), IEEE, Amsterdam, Netherlands, pp. 297-298.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Epidemics have always been a source of concern to people, both at the individual and government level. To fight outbreaks effectively, we need advanced tools that enable us to understand the factors that influence the spread of life-threatening diseases.
Akbar, MS & Gabrys, B 1970, 'Data Analytics Enhanced Data Visualization and Interrogation with Parallel Coordinates Plots', 2018 26th International Conference on Systems Engineering (ICSEng), 2018 26th International Conference on Systems Engineering (ICSEng), IEEE, Sydney, Australia, Australia.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Parallel coordinates plots (PCPs) suffer from the curse of dimensionality when used with larger multidimensional datasets. The curse of dimensionality results in clutter, which hides important visual data trends among coordinates. A number of solutions to address this problem have been proposed, including filtering, aggregation, and dimension reordering. These solutions, however, have their own limitations with regard to exploring relationships and trends among the coordinates in PCPs. Correlation-based coordinate reordering techniques are among the most popular and have been widely used in PCPs to reduce clutter, though based on the conducted experiments, this research has identified some of their limitations. To achieve better visualization with reduced clutter, we have proposed and evaluated a dimension reordering approach based on minimizing the number of crossing pairs. In the last step, k-means clustering is combined with the reordered coordinates to highlight key trends and patterns. The conducted comparative analysis has shown that the minimum crossing pairs approach performed much better than the other applied techniques for coordinate reordering, and, when combined with k-means clustering, resulted in better visualization with significantly reduced clutter.
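A minimal sketch of one plausible way to count the crossing pairs mentioned above (the paper's exact counting and reordering procedure is not given in the abstract, so this is an assumption): two data lines cross between adjacent axes exactly when the order of their values flips from one axis to the other.

```python
# Hypothetical sketch: count line crossings between two adjacent parallel
# coordinates. Lines i and j cross between axes a and b when the sign of
# their difference flips from axis a to axis b.
import numpy as np

def crossing_pairs(axis_a, axis_b):
    """Count pairs of lines that cross between two adjacent axes."""
    n = len(axis_a)
    crossings = 0
    for i in range(n):
        for j in range(i + 1, n):
            if (axis_a[i] - axis_a[j]) * (axis_b[i] - axis_b[j]) < 0:
                crossings += 1
    return crossings

rng = np.random.default_rng(1)
data = rng.random((200, 2))            # 200 lines over two coordinates
print(crossing_pairs(data[:, 0], data[:, 1]))
```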
Alfaro-Garcia, VG, Merigo, JM, Plata-Perez, L & Calderon, GGA 1970, 'On Ordered Weighted Logarithmic Averaging Operators and Distance Measures', 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, Bangalore, India, pp. 1472-1477.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. In this paper we give an in-depth description of the main properties and families of the previously introduced ordered weighted logarithmic averaging distance (OWLAD) operator, the generalized weighted logarithmic averaging distance (GWLAD) operator, and the generalized ordered weighted logarithmic averaging distance (GOWLAD) operator. These operators are founded on the well-known Hamming distance measure and the generalized ordered weighted logarithmic averaging (GOWLA) operator. Furthermore, we analyze multiple classical measures to characterize the operators' weighting vectors and present alternative formulations of the operators based on the ordering of the arguments.
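For reference, the weighted Hamming distance that serves as the stated foundation of these operators, for argument vectors \(x\) and \(y\) and a weighting vector \(w\) summing to one (the OWLAD/GOWLAD aggregation forms themselves are defined in the paper and not restated here):
\[ d_{H}(x, y) = \sum_{i=1}^{n} w_i \,\lvert x_i - y_i \rvert, \qquad \sum_{i=1}^{n} w_i = 1. \]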
Alkalbani, AM & Hussain, FK 1970, 'Quality CloudCrowd: A Crowdsourcing Platform for QoS Assessment of SaaS Services', Springer International Publishing, pp. 235-240.
View/Download from: Publisher's site
View description>>
The adoption of Software as a Service (SaaS) has grown rapidly since 2010, and the need for Quality of Service (QoS) information is a significant factor in selecting a trustworthy SaaS service. In the existing literature, little attention has been given to providing QoS information with the SaaS service offering. SaaS providers offer a description of the overall QoS and service performance when they make their service offer; however, service user satisfaction is a crucial factor in service selection decision-making. Crowdsourcing has grown in popularity over the last few years for performing tasks such as product design and consumer feedback, and it has attracted researchers in particular to the field of client feedback on services and products. In this paper, we propose a novel framework, called “Quality CloudCrowd”, for providing missing QoS values to the cloud marketplace. Our proposed framework comprises several parts; however, the development of the QCC platform for collecting missing QoS values is the core element of this structure and the focus of this paper.
Almasoud, AS, Eljazzar, MM & Hussain, F 1970, 'Toward a Self-Learned Smart Contracts', 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE), 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE), IEEE, Xi'an, China, pp. 269-273.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. In recent years, Blockchain technology has been highly valued and disruptive. Several studies have presented the merging of blockchain with current applications, e.g. medical, supply chain, and e-commerce. Although Blockchain architecture does not yet have a standard, IBM, Microsoft and AWS offer BaaS (Blockchain as a Service) in addition to the current public chains, i.e. Ethereum, NEO, and Cardano, and there are some differences between the various public ledgers in terms of development and architecture. This paper introduces the main factors that affect the integration of Artificial Intelligence with Blockchain, as well as how it could be integrated for forecasting and automation, building a self-regulated chain.
Alshehri, MD & Hussain, FK 1970, 'A Centralized Trust Management Mechanism for the Internet of Things (CTM-IoT)', Advances on Broad-Band Wireless Computing, Communication and Applications, International Conference on Broad-Band Wireless Computing, Communication and Applications, Springer International Publishing, Barcelona, Spain, pp. 533-543.
View/Download from: Publisher's site
View description>>
The Internet of Things (IoT) is an extended network that allows all devices to be connected to one another over the Internet. This new network faces numerous challenges, but mainly security issues. One such issue is how the IoT’s nodes can trust each other when they are connected over the Internet. There is a lack of studies that address the issue of trust management in IoT, or that provide a fully trustworthy framework. This paper proposes and delivers a centralized trust management mechanism for IoT by adding trust modules as a feature of the central trust manager, the Super Node (SN). To deliver a comprehensive approach, the SN includes other modules which are integrated with the whole IoT Trust Management framework to provide trustworthy communication between all nodes.
Anaissi, A, Braytee, A & Naji, M 1970, 'Gaussian Kernel Parameter Optimization in One-Class Support Vector Machines', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The one-class support vector machine (OCSVM) with a Gaussian kernel function is a promising machine learning method which has been employed extensively in the area of anomaly detection. However, the generalization performance of OCSVM is profoundly influenced by its Gaussian model parameter σ. This paper proposes a new algorithm named Edged Support Vector (ESV) for tuning the Gaussian model parameter. The main idea of this algorithm is to inspect the spatial locations of the selected support vector samples. The algorithm selects the optimal value of σ which leads to a decision boundary that has all its support vectors residing on the surface of the training data (i.e. edged support vectors). A support vector is identified as an edge sample by constructing a hyperplane with its k-nearest neighbour samples using a hard-margin linear support vector machine. The algorithm was successfully validated using two real-world sensing datasets, one collected from a lab specimen which replicated a jack arch from the Sydney Harbour Bridge, and another collected from sensors mounted on vehicles for road condition assessment. The results show that the designed ESV algorithm is an appropriate choice for identifying the optimal value of σ for OCSVM.
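Not the ESV algorithm itself, but a minimal sketch of where the parameter enters: scikit-learn's OneClassSVM exposes the Gaussian kernel width through gamma = 1/(2σ²), and different σ values yield very different support-vector sets and decision boundaries.

```python
# Generic illustration (not ESV): sweep candidate sigma values for a
# Gaussian-kernel one-class SVM and report how many support vectors each
# boundary uses. ESV would instead test whether each support vector lies on
# the edge of the data via a hard-margin linear SVM over its k-nearest
# neighbours; that step is not reproduced here.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))          # placeholder "normal" training data

for sigma in [0.1, 0.5, 1.0, 2.0, 5.0]:
    gamma = 1.0 / (2.0 * sigma ** 2)   # RBF kernel: exp(-gamma * ||x - y||^2)
    clf = OneClassSVM(kernel="rbf", gamma=gamma, nu=0.1).fit(X)
    print(f"sigma={sigma:4.1f}  support vectors={len(clf.support_)}")
```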
Biddle, R, Liu, S & Xu, G 1970, 'Semi-Supervised Soft K-means Clustering of Life Insurance Questionnaire Responses', 2018 5TH INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC, AND SOCIO-CULTURAL COMPUTING (BESC), 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing (BESC), IEEE, PEOPLES R CHINA, Natl Univ Kaohsiung, Kaohsiung, pp. 30-31.
View/Download from: Publisher's site
Biddle, R, Liu, S, Tilocca, P & Xu, G 1970, 'Automated Underwriting in Life Insurance: Predictions and Optimisation', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Australasian Database Conference, Springer International Publishing, Gold Coast, QLD, Australia, pp. 135-146.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. Underwriting is an important stage in the life insurance process and is concerned with accepting individuals into an insurance fund and determining on what terms. It is a tedious and labour-intensive process for both the applicant and the underwriting team. An applicant must fill out a large survey containing thousands of questions about their life. The underwriting team must then process this application, assess the risks posed by the applicant and offer them insurance products as a result. Our work implements and evaluates classical data mining techniques to help automate some aspects of the process, to ease the burden on the underwriting team as well as to optimise the survey and improve the applicant experience. Logistic Regression, XGBoost and Recursive Feature Elimination are proposed as techniques for the prediction of underwriting outcomes. We conduct experiments on a dataset provided by a leading Australian life insurer and show that our early-stage results are promising and serve as a foundation for further work in this space.
Braytee, A, Anaissi, A & Kennedy, PJ 1970, 'Sparse Feature Learning Using Ensemble Model for Highly-Correlated High-Dimensional Data', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), International Conference on Neural Information Processing, Springer International Publishing, Siem Reap, Cambodia, pp. 423-434.
View/Download from: Publisher's site
View description>>
© Springer Nature Switzerland AG 2018. High-dimensional, highly correlated data exist in several domains such as genomics. Many feature selection techniques consider correlated features as redundant and therefore to be removed. Several studies investigate the interpretation of correlated features in domains such as genomics, but investigating the classification capabilities of correlated feature groups is a point of interest in several domains. In this paper, a novel method is proposed that integrates ensemble feature ranking and co-expression networks to identify the optimal features for classification. The main advantage of the proposed method lies in the fact that it does not consider correlated features as redundant; rather, it shows the importance of the selected correlated features for improving classification performance. A series of experiments on five high-dimensional, highly correlated datasets with different imbalance ratios show that the proposed method outperformed the state-of-the-art methods.
Brownlow, J, Chu, C, Fu, B, Xu, G, Culbert, B & Meng, Q 1970, 'Cost-Sensitive Churn Prediction in Fund Management Services', Database Systems for Advanced Applications (LNCS), International Conference on Database Systems for Advanced Applications, Springer International Publishing, Gold Coast, QLD, Australia, pp. 776-788.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. Churn prediction is vital to companies as it identifies potential churners and prevents losses in advance. Although it has been addressed as a classification task and a variety of models have been employed in practice, fund management services present several special challenges. One is that financial data is extremely imbalanced, since only a tiny proportion of customers leave every year. Another is a unique cost-sensitive learning problem, i.e., the costs of wrong predictions for churners should be related to their account balances, while the costs of wrong predictions for non-churners should be the same. To address these issues, this paper proposes a new churn prediction model based on ensemble learning. In our model, multiple classifiers are built using sampled datasets to tackle the imbalanced data issue while exploiting the data fully. Moreover, a novel sampling strategy is proposed to deal with the unique cost-sensitive issue. This model has been deployed in one of the leading fund management institutions in Australia, and its effectiveness has been fully validated in real applications.
Brownlow, J, Chu, C, Xu, G, Culbert, B, Fu, B & Meng, AQ 1970, 'A Multiple Source based Transfer Learning Framework for Marketing Campaigns', Proceedings of the International Joint Conference on Neural Networks, 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio De Janeiro, Brasil.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The rapidly growing number of marketing campaigns demands an efficient learning model to identify prospective customers to target. Transfer learning is widely considered a major way to improve learning performance by using the knowledge generated from previous learning tasks. Most recent studies focus on transferring knowledge from source domains to target domains, which may result in knowledge being missed. To avoid this, we propose a multiple-source-based transfer learning framework that works in reverse: the data in the target domains is transferred into the source domains by normalizing them into the same distributions, and the learning task in the target domains is then improved by the knowledge generated in the source domains. The proposed method is general, can deal with supervised and unsupervised, inductive and transductive learning simultaneously, and is compatible with different machine learning models. The experiments on real-world campaign data demonstrate the performance of the proposed method.
Butler, A, Xu, G & Musial, K 1970, 'Research Performance Reporting is Fallacious', Proceedings - 2018 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing, BESC 2018, 2018 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing (BESC), IEEE, Taiwan, pp. 1-5.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Citation-based research performance reporting is contentious. The methods used to categorize research and researchers are misleading and somewhat arbitrary. This paper compares cohorts of social-science-categorized citation data and ultimately shows that assumptions of comparability are spurious. A subject area comparison using research field distributions and networks between a 'reference author', bibliographically coupled data, keyword-obtained data, social science data and highly cited social science author data shows very dissimilar field foci, with one dataset being very much medically focused. This leads to the question of whether subject area classifications should continue to be used as the basis for the plethora of rankings and lists that use such groupings. It is suggested that bibliographic coupling and dynamic topic classifiers would better inform citation data comparisons.
Cao, G, Downes, A, Khan, S, Wong, W & Xu, G 1970, 'Taxpayer Behavior Prediction in SMS Campaigns', Proceedings - 2018 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing, BESC 2018, 2018 5th International Conference on Behavioural, Economic, and Socio-Cultural Computing, Taiwan, pp. 19-23.
View/Download from: Publisher's site
View description>>
This paper presents a prediction study of a group of small businesses that have a higher risk of non-compliance with taxation obligations. These businesses have been selected for a pre-emptive SMS reminder campaign, and prediction models are used to predict the probability of on-time payment. Through experiments on a real-world taxation debt dataset, it is found that the XGBoost algorithm significantly outperforms the random forest, decision tree and logistic regression algorithms. The variables showing the largest explanatory power are related to debt amount. Second and subsequent SMS messages make a negligible contribution to the probability of payment. The XGBoost explainer is also used to delve further into the inner workings of the algorithm.
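A minimal sketch of the kind of model comparison described above (the actual taxation dataset and its features are not public, so the data and feature names here are synthetic placeholders):

```python
# Illustrative only: fit an XGBoost classifier to predict the probability of
# on-time payment from hypothetical debt-related features, next to a logistic
# regression baseline, and compare by AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))                # e.g. debt amount, debt age, prior SMS count, ...
y = (X[:, 0] + 0.5 * rng.normal(size=5000) > 0).astype(int)   # synthetic on-time payment label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, model in [("xgboost", XGBClassifier(n_estimators=200, max_depth=4)),
                    ("logistic", LogisticRegression(max_iter=1000))]:
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]   # predicted probability of on-time payment
    print(name, round(roc_auc_score(y_te, proba), 3))
```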
Chemalamarri, VD, Braun, R, Lipman, J & Abolhasan, M 1970, 'A Multi-agent Controller to enable Cognition in Software Defined Networks', 2018 28th International Telecommunication Networks and Applications Conference (ITNAC), 2018 28th International Telecommunication Networks and Applications Conference (ITNAC), IEEE, Sydney, Australia, pp. 52-56.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Current SDN controllers are not cognitive. We propose a new architecture for an SDN controller to enable intelligence. The proposed new architecture is based on Multi-agent systems. As a prototype, we have built a MAS-SDN controller using the GOAL agent programming language. We highlight the motivation behind the new architecture, describe the architecture and provide some initial results.
Chen, X, Liu, F, Tu, E, Cao, L & Yang, J 1970, 'Deep-PUMR: Deep Positive and Unlabeled Learning with Manifold Regularization', Lecture Notes in Computer Science, International Conference on Neural Information Processing, Springer International Publishing, Siem Reap, Cambodia, pp. 12-20.
View/Download from: Publisher's site
View description>>
Training a binary classifier only on positive and unlabeled examples (i.e., PU learning) is an important yet challenging problem, widely seen in settings where it is difficult to obtain negative examples. Existing methods for handling this challenge often perform unsatisfactorily, since they ignore the relations between positive and unlabeled examples and are also limited to traditional shallow learning frameworks. Therefore, this work proposes a new approach: Deep Positive and Unlabeled learning with Manifold Regularization (Deep-PUMR), which integrates manifold regularization with deep neural networks to address the above issues with classic PU learning. Deep-PUMR holds two major advantages: (i) our method exploits the manifold properties of the data distribution to capture the relationship between positive and unlabeled examples; (ii) the adopted deep network gives Deep-PUMR strong learning ability, especially on large-scale datasets. Extensive experiments on five diverse datasets demonstrate that Deep-PUMR achieves state-of-the-art performance in comparison with classic PU learning algorithms and risk estimators.
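For reference, a generic form of the objective such an approach optimizes, combining a PU classification loss with a manifold regularization term over a similarity graph \(W\) of the examples (this is a sketch of the standard construction, not necessarily the exact Deep-PUMR objective):
\[ \min_{f}\; \mathcal{L}_{\mathrm{PU}}(f) \;+\; \gamma \sum_{i,j} W_{ij}\,\lVert f(x_i) - f(x_j) \rVert^{2}, \]
where \(f\) is the deep network and \(\gamma\) controls how strongly nearby examples on the data manifold are pushed toward similar outputs.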
Cobo, MJ, Wang, W, Laengle, S, Merigó, JM, Yu, D & Herrera-Viedma, E 1970, 'Co-words Analysis of the Last Ten Years of the International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems', Communications in Computer and Information Science, Springer International Publishing, Cádiz, Spain, pp. 667-677.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. The main aim of this contribution is to develop a co-word analysis of the International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems over the last ten years (2008–2017). The software tool SciMAT is employed using an approach that allows us to uncover the main research themes and analyze them according to their performance measures (qualitative and quantitative). A total of 562 documents were retrieved from the Web of Science. The corpus was divided into two consecutive periods (2008–2012 and 2013–2017). Our key finding is that the most important research themes in both the first and second periods were devoted to the decision-making process and its related aspects, techniques and methods.
Cunill, OM, Gil-Lafuente, AM, Merigó, JM & González, LO 1970, 'Academic Contributions in Asian Tourism Research: A Bibliometric Analysis', Advances in Intelligent Systems and Computing, International Conference of the ‘Forum for Interdisciplinary Mathematics', Springer International Publishing, Spain, pp. 326-342.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. Bibliometrics is a fundamental field of information science that helps to draw quantitative conclusions about bibliographic material. During the last decade, the use of bibliometric techniques and studies has increased significantly due to improvements in information technology and their usefulness for organizing knowledge in a scientific discipline. This paper presents an overview of the most productive and influential Asian universities and countries in academic tourism research through the use of bibliometric indicators, according to information found in the Web of Science (WoS) database. This database is considered one of the main tools for the analysis of scientific information. In order to analyze the information obtained, several rankings of universities and countries have been produced, both global and individual, based on a series of bibliometric indicators such as the number of publications, the number of citations and the h-index. Analyzing the results, we observe that within tourism research in Asia, the most influential countries are China, Taiwan and South Korea, and that the leading university is Hong Kong Polytechnic University.
Cunill, OM, Gil-Lafuente, AM, Merigó, JM & González, LO 1970, 'Asian Academic Research in Tourism with an International Impact: A Bibliometric Analysis of the Main Academic Contributions', Advances in Intelligent Systems and Computing, International Conference of the ‘Forum for Interdisciplinary Mathematics', Springer International Publishing, Spain, pp. 307-325.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. Asian academic research in tourism is a very recent field of research, which has developed significantly over the last decade due to the strong expansion of the tourism industry worldwide, and also owing to the strong evolution of Internet search engines. This article analyses the main contributions to Asian academic research in tourism over recent years using bibliometric indicators. The results obtained are based on the information contained in the Web of Science database. These results focus on answering three fundamental questions. Firstly, we study the publication structure of Asian articles in tourism over recent decades, as well as the citations these articles have received. Secondly, we present a ranking of the most important tourism journals in Asia through the use of a series of indicators such as the number of publications in said journals, the number of citations, and the h-index. Finally, we present a list of the 50 most cited Asian articles in tourism (and hence the ones that can be considered the most influential) of all time. The results show that, in Asian terms, the most influential journals in this field are Tourism Management (TM), the Annals of Tourism Research (ATR) and the International Journal of Hospitality Management (IJHM).
Do, TDT & Cao, L 1970, 'Coupled Poisson Factorization Integrated With User/Item Metadata for Modeling Popular and Sparse Ratings in Scalable Recommendation', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), New Orleans, USA, pp. 2918-2925.
View/Download from: Publisher's site
View description>>
Modelling sparse and large data sets is highly in demand yet challenging in recommender systems. With the computation only on the non-zero ratings, Poisson Factorization (PF) enabled by variational inference has shown its high efficiency in scalable recommendation, e.g., modeling millions of ratings. However, as PF learns the ratings by individual users on items with the Gamma distribution, it cannot capture the coupling relations between users (items) and the rating popularity (i.e., favorable rating scores that are given to one item) and rating sparsity (i.e., those users (items) with many zero ratings) for one item (user). This work proposes Coupled Poisson Factorization (CPF) to learn the couplings between users (items), and the user/item attributes (i.e., metadata) are integrated into CPF to form the Metadata-integrated CPF (mCPF) to not only handle sparse but also popular ratings in very large-scale data. Our empirical results show that the proposed models significantly outperform PF and address the key limitations in PF for scalable recommendation.
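For reference, the standard hierarchical Poisson factorization that CPF builds on, in its usual form (the coupled and metadata-integrated extensions are defined in the paper itself): each user factor \(\theta_u\) and item factor \(\beta_i\) carries a Gamma prior, and a rating is Poisson-distributed with a rate given by their inner product,
\[ \theta_{uk} \sim \mathrm{Gamma}(a, b), \qquad \beta_{ik} \sim \mathrm{Gamma}(c, d), \qquad y_{ui} \sim \mathrm{Poisson}\bigl(\theta_{u}^{\top} \beta_{i}\bigr). \]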
Do, TDT & Cao, L 1970, 'Metadata-dependent Infinite Poisson Factorization for Efficiently Modelling Sparse and Large Matrices in Recommendation', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 5010-5016.
View/Download from: Publisher's site
View description>>
Matrix Factorization (MF) is widely used in Recommender Systems (RSs) for estimating missing ratings in the rating matrix. MF faces the major challenges of handling very sparse and large data. Poisson Factorization (PF), as an MF variant, addresses these challenges with high efficiency by computing only on the non-missing elements. However, ignoring the missing elements in computation makes PF weak at, or incapable of, dealing with columns or rows with very few observations (corresponding to sparse items or users). In this work, Metadata-dependent Poisson Factorization (MPF) is proposed to address user/item sparsity by integrating user/item metadata into PF. MPF adds the metadata-based observed entries to the factorized PF matrices. In addition, similar to MF, choosing a suitable number of latent components for PF is very expensive on very large datasets. Accordingly, we further extend MPF to Metadata-dependent Infinite Poisson Factorization (MIPF), which integrates a Bayesian Nonparametric (BNP) technique to automatically tune the number of latent components. Our empirical results show that, by integrating metadata, MPF/MIPF significantly outperform the state-of-the-art PF models for sparse and large datasets. MIPF also effectively estimates the number of latent components.
Gamal, M, Abolhasan, M, Lipman, J, Liu, RP & Ni, W 1970, 'Multi Objective Resource Optimisation for Network Function Virtualisation Requests', 2018 26th International Conference on Systems Engineering (ICSEng), 2018 26th International Conference on Systems Engineering (ICSEng), IEEE, Sydney, Australia.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Network function virtualization (NFV), as a new research concept for both academia and industry, faces many challenges before it can be accepted into the mainstream by network operators. One challenge addressed in this paper is to find the optimal placement for a set of incoming requests with VNF service chains to be served in suitable Virtual Machines (VMs) such that a set of conflicting objectives is met. Mainly, the focus is placed on maximizing the total cost saving by increasing the total CPU utilization during the processing time and increasing the processing time for every service request in the cloud network. Moreover, we aim to maximize the admitted traffic simultaneously while considering the system constraints. We formulate the problem as a multi-objective optimization problem and use a Resource Utilization Multi-Objective Evolutionary Algorithm based on Decomposition (RU-MOEA/D) to solve it considering the two objectives simultaneously. Extensive simulations are carried out to evaluate the effects of different network sizes, genetic parameters and the number of server resources on the acceptance ratio of the arriving chains to be served in the available VMs. The empirical results illustrate that the proposed algorithm can solve the problem efficiently and compute the optimal solution for the two objectives together within a reasonable running time.
Gheisari, S, Catchpoole, DR, Charlton, A & Kennedy, PJ 1970, 'Patched Completed Local Binary Pattern is an Effective Method for Neuroblastoma Histological Image Classification', Communications in Computer and Information Science, Australasian Conference on Data Mining, Springer Singapore, Bathurst, NSW, Australia, pp. 57-71.
View/Download from: Publisher's site
View description>>
© Springer Nature Singapore Pte Ltd. 2018. Neuroblastoma is the most common extracranial solid tumour in children. The histology of neuroblastoma has high intra-class variation, which misleads existing computer-aided histological image classification methods that use global features. To tackle this problem, we propose a new Patched Completed Local Binary Pattern (PCLBP) method combining the Sign Binary Pattern (SBP) and Magnitude Binary Pattern (MBP) within local patches to build feature vectors, which are classified by k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM) classifiers. The advantage of our method is that it extracts local features which are more robust to intra-class variation than global ones. We gathered a database of 1043 histological images of neuroblastic tumours classified into five subtypes. Our experiments show the proposed method improves the weighted average F-measure by 1.89% and 0.81% with the k-NN and SVM classifiers, respectively.
Gui, M, Zhang, Z, Yang, Z, Gu, Y & Xu, G 1970, 'An Effective Joint Framework for Document Summarization', Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18, Companion of the The Web Conference 2018, ACM Press, pp. 121-122.
View/Download from: Publisher's site
View description>>
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License. Document summarization is an important research issue and has attracted much attention from academia. Approaches to document summarization can be classified as extractive or abstractive. In this work, we introduce an effective joint framework that integrates extractive and abstractive summarization models, which is much closer to the way humans write summaries (first underlining the important information). Preliminary experiments on a real benchmark dataset demonstrate that our model is competitive with the state-of-the-art methods.
Guo, D, Zhao, W, Cui, Y, Wang, Z, Chen, S & Zhang, J 1970, 'Siamese Network Based Features Fusion for Adaptive Visual Tracking', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), International Conference on Artificial Intelligence, Springer International Publishing, China, pp. 759-771.
View/Download from: Publisher's site
View description>>
© Springer Nature Switzerland AG 2018. Visual object tracking is a popular but challenging problem in computer vision. The main challenge is the lack of prior knowledge of the tracking target, which may be supervised only by a bounding box given in the first frame. Besides, tracking suffers from many influences such as scale variations, deformations, partial occlusions and motion blur. To solve such a challenging problem, a tracking framework that can adapt to different tracking scenes is required. This paper presents a novel approach for robust visual object tracking based on multiple-feature fusion in a Siamese Network. Hand-crafted appearance features and CNN features are combined to compensate for each other's shortcomings and enhance their respective advantages. The proposed network proceeds as follows. Firstly, different features are extracted from the tracking frames. Secondly, each extracted feature is passed through a Correlation Filter to learn a corresponding template, which is used to generate a response map. Finally, the multiple response maps are fused to obtain a better response map, which helps to locate the target more accurately. Comprehensive experiments are conducted on three benchmarks: Temple-Color, OTB50 and UAV123. Experimental results demonstrate that the proposed approach achieves state-of-the-art performance on these benchmarks.
Hayati, H, Walker, P, Brown, T, Kennedy, P & Eager, D 1970, 'A Simple Spring-Loaded Inverted Pendulum (SLIP) Model of a Bio-Inspired Quadrupedal Robot Over Compliant Terrains', Volume 4B: Dynamics, Vibration, and Control, ASME 2018 International Mechanical Engineering Congress and Exposition, American Society of Mechanical Engineers, USA.
View/Download from: Publisher's site
View description>>
To study the impact of compliant terrains on the biomechanics of rapid legged movements, a well-known spring-loaded inverted pendulum (SLIP) model is deployed. The model is a three-degrees-of-freedom (3 DOF) system, inspired by galloping greyhounds competing under racing conditions. A single support phase of hind-leg stance in a galloping gait is considered due to its primary function in powering the greyhound's locomotion and its higher rate of musculoskeletal injuries. The nonlinear second-order differential equations of motion were obtained using the Lagrangian method and solved in MATLAB R2017b with the ode45 solver, which is based on the Runge-Kutta method. To obtain the viscoelastic behaviour of compliant terrains, a Clegg hammer test was developed and performed five times on each sample. The effective spring and damping coefficients of each sample were then determined from the hysteresis curves. The results showed that galloping on synthetic rubber requires more muscle force than galloping on wet sand. However, according to the Clegg hammer test, wet sand had a higher impact force than synthetic rubber, which can be a risk factor for bone fracture, particularly hock fracture, in greyhounds. The results reported in this paper are not only useful for identifying optimum terrain properties and injury thresholds of an athletic track, but can also be used to design control methods and shock impedances for legged robots performing on compliant terrains.
Hu, L, Jian, S, Cao, L & Chen, Q 1970, 'Interpretable Recommendation via Attraction Modeling: Learning Multilevel Attractiveness over Multimodal Movie Contents', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 3400-3406.
View/Download from: Publisher's site
View description>>
New contents like blogs and online videos are produced every second in the new media age. We argue that attraction is one of the decisive factors in user selection of new contents. However, collaborative filtering cannot work without user feedback, and existing content-based recommender systems are unable to capture and interpret the attractive points of new contents. Accordingly, we propose attraction modeling to learn and interpret user attractiveness. Specifically, we build a multilevel attraction model (MLAM) over the content features -- the story (textual data) and cast members (categorical data) of movies. In particular, we design multilevel personal filters to calculate users' attractiveness on words, sentences and cast members at different levels. The experimental results show the superiority of MLAM over the state-of-the-art methods. In addition, a case study is provided to demonstrate the interpretability of MLAM by visualizing user attractiveness on a movie.
Huang, H, Xu, J, Zhang, J, Wu, Q & Kirsch, C 1970, 'Railway Infrastructure Defects Recognition using Fine-grained Deep Convolutional Neural Networks', 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018 Digital Image Computing: Techniques and Applications (DICTA), IEEE, Canberra, Australia.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Railway power supply infrastructure is one of the most important components of railway transportation. As the key step of the railway maintenance system, power supply infrastructure defect recognition plays a vital role in the whole defect inspection sub-system. Traditionally, the defect recognition task is performed manually, which is time-consuming and labour-intensive. Inspired by the great success of deep neural networks in dealing with different vision tasks, this paper presents an end-to-end deep network to solve the railway infrastructure defect detection problem. More importantly, this paper is the first work that adopts the idea of deep fine-grained classification for railway defect detection. We propose a new bilinear deep network named Spatial Transformer And Bilinear Low-Rank (STABLR) model and apply it to railway infrastructure defect detection. The experimental results demonstrate that the proposed method outperforms both hand-crafted feature-based machine learning methods and classic deep neural network methods.
Ikram, MA & Hussain, FK 1970, 'Software as a Service (SaaS) Service Selection Based on Measuring the Shortest Distance to the Consumer’s Preferences', Springer International Publishing, pp. 403-415.
View/Download from: Publisher's site
View description>>
Software as a Service (SaaS) is a type of cloud service that runs and operates over the Platform as a Service (PaaS), which in turn works on the Infrastructure as a Service (IaaS). In the past few years, there has been an enormous growth in the number of SaaS services. It is estimated that the revenue of SaaS services will reach US$ 112.8 billion in 2019. This growth in the number of SaaS services makes the selection process difficult for a consumer who is looking to select the best service among the many services that have similar functionalities. In this article, we propose a Find SaaS framework to select a service based on measuring the shortest distance to the consumer’s preferences. In order to explain how the Find SaaS framework works, a case study based on selecting a computer repair shop’s SaaS application for the consumer has been presented.
Jauregi Unanue, I, Zare Borzeshi, E & Piccardi, M 1970, 'A Shared Attention Mechanism for Interpretation of Neural Automatic Post-Editing Systems', Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, Association for Computational Linguistics, Melbourne, Australia, pp. 11-17.
View/Download from: Publisher's site
View description>>
Automatic post-editing (APE) systems aim to correct the systematic errors made by machine translators. In this paper, we propose a neural APE system that encodes the source (src) and machine translated (mt) sentences with two separate encoders, but leverages a shared attention mechanism to better understand how the two inputs contribute to the generation of the post-edited (pe) sentences. Our empirical observations have shown that when the mt is incorrect, the attention shifts weight toward tokens in the src sentence to properly edit the incorrect translation. The model has been trained and evaluated on the official data from the WMT16 and WMT17 APE IT-domain English-German shared tasks. Additionally, we have used the extra 500K artificial data provided by the shared task. Our system has been able to reproduce the accuracies of systems trained with the same data, while at the same time providing better interpretability.
Jian, S, Hu, L, Cao, L & Lu, K 1970, 'Metric-based auto-instructor for learning mixed data representation', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 3318-3325.
View description>>
Mixed data with both categorical and continuous features are ubiquitous in real-world applications. Learning a good representation of mixed data is critical yet challenging for further learning tasks. Existing methods for representing mixed data often overlook the heterogeneous coupling relationships between categorical and continuous features as well as the discrimination between objects. To address these issues, we propose an auto-instructive representation learning scheme to enable margin-enhanced distance metric learning for a discrimination-enhanced representation. Accordingly, we design a metric-based auto-instructor (MAI) model which consists of two collaborative instructors. Each instructor captures the feature-level couplings in mixed data with fully connected networks, and guides the infinite-margin metric learning for the peer instructor with a contrastive order. By feeding the learned representation into both partition-based and density-based clustering methods, our experiments on eight UCI datasets show highly significant learning performance improvement and much more distinguishable visualization outcomes over the baseline methods.
Jiang, X, Pan, S, Long, G, Chang, J, Jiang, J & Zhang, C 1970, 'Cost-sensitive Hybrid Neural Networks for Heterogeneous and Imbalanced Data', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-8.
View/Download from: Publisher's site
View description>>
Analyzing accumulated data has recently attracted huge attention for its ability to generate value by identifying useful information and providing an edge in global business competition. However, heterogeneous data and imbalanced class distributions present two major challenges to machine learning with real-world business data. Traditional machine learning algorithms can typically only be applied to standard data sets, which are normally homogeneous and balanced. These algorithms narrow complex data into a homogeneous, balanced data space, an inefficient process that requires a significant amount of pre-processing. In this paper, we focus on an efficient solution to the challenges of heterogeneous and imbalanced data sets that does not require pre-processing. Our approach comprises a novel, unified, end-to-end cost-sensitive hybrid neural network that learns real-world heterogeneous data via a parallel network architecture. A specifically designed cost-sensitive matrix then automatically generates a robust model for learning minority classifications. The parameters of both the cost-sensitive matrix and the hybrid neural network are alternately but jointly optimized during training. The results of comparative experiments on six real-world data sets reflecting actual business cases, including insurance fraud detection and mobile customer demographics, indicate that the proposed approach demonstrates superior performance over baseline procedures.
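As a minimal illustration of the cost-sensitive idea described above (not the paper's parallel hybrid architecture or its jointly learned cost-sensitive matrix), the sketch below simply weights a cross-entropy loss by inverse class frequency; the toy network, feature dimension and class counts are assumptions.

```python
# Minimal sketch of cost-sensitive learning via class weighting (a
# simplification; the paper jointly learns a cost-sensitive matrix together
# with a parallel hybrid network). The toy data and network are assumptions.
import torch
import torch.nn as nn

class_counts = torch.tensor([900.0, 80.0, 20.0])               # imbalanced class sizes
weights = class_counts.sum() / (len(class_counts) * class_counts)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
criterion = nn.CrossEntropyLoss(weight=weights)                # penalise minority errors more
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(64, 16)                                        # toy feature batch
y = torch.randint(0, 3, (64,))                                 # toy labels
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```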
Jin, D, Liu, Z, He, D, Gabrys, B & Musial, K 1970, 'Robust Detection of Communities with Multi-semantics in Large Attributed Networks', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), International Conference on Knowledge Science, Engineering and Management, Springer International Publishing, Changchun, China, pp. 362-376.
View/Download from: Publisher's site
View description>>
© 2018, Springer Nature Switzerland AG. In this paper, we are interested in how to explore and utilize the relationship between network communities and semantic topics in order to robustly find communities with strong explanatory power. First, the relationship between communities and topics takes different forms. For example, from the viewpoint of semantic mapping, their relationship can be one-to-one, one-to-many or many-to-one, while from the standpoint of underlying community structures, the relationship can be consistent, partially consistent or completely inconsistent. Second, it is helpful not only to find communities more precisely but also to reveal the communities’ semantics, which reflects the relationship between communities and topics. To better describe this relationship, we introduce the transition probability, an important concept in Markov chains, into a well-designed nonnegative matrix factorization framework. This transition probability matrix, combined with a suitable prior, depicts the relationship between communities and topics and performs well in this task. To illustrate the effectiveness of the proposed approach, we conduct experiments on both synthetic and real networks. The results show that our new method is superior to the baselines in accuracy. We finally conduct a case study analysis to validate the new method’s strong interpretability of the detected communities.
Lan, C, Peng, H, McGowan, EM, Hutvagner, G & Li, J 1970, 'An isomIR expression panel based novel breast cancer classification approach using improved mutual information', International Conference on Genome Informatics, International Conference on Genome Informatics, Kunming, Yunnan, China.
Li, C, Deng, C, Li, N, Liu, W, Gao, X & Tao, D 1970, 'Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval', 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp. 4242-4251.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal hashing in a self-supervised fashion. The primary contribution of this work is that two adversarial networks are leveraged to maximize the semantic correlation and consistency of the representations between different modalities. In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations. Such information guides the feature learning process and preserves the modality relationships in both the common semantic space and the Hamming space. Extensive experiments carried out on three benchmark datasets validate that the proposed SSAH surpasses the state-of-the-art methods.
Li, J 1970, 'Version Space Completeness for Novel Hypothesis Induction in Biomedical Applications', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The use of traditional discretization methods has caused a heavy loss of hypotheses in the induction of version spaces. We present a new discretization method, named two-point discretization, to construct an interval covering all the positive data points of a variable as purely as possible. We prove that two-point discretization is a necessary and sufficient condition to guarantee the completeness of version spaces (i.e., no loss of hypotheses). A linear-complexity algorithm is proposed to implement these theories. The algorithm is also applied to real-world bioinformatics problems to induce significant biomedical hypotheses which have never been discovered by traditional approaches.
Li, J, Fong, S, Hu, S, Wong, RK & Mohammed, S 1970, 'Similarity Majority Under-Sampling Technique for Easing Imbalanced Classification Problem', Communications in Computer and Information Science, Australasian Conference on Data Mining, Springer Singapore, Melbourne, VIC, Australia, pp. 3-23.
View/Download from: Publisher's site
View description>>
© Springer Nature Singapore Pte Ltd. 2018. The imbalanced classification problem is an active topic in the fields of data mining, machine learning and pattern recognition. The imbalanced distribution of class samples results in the classifier being over-fitted by learning too many majority class samples and under-fitted in recognizing minority class samples. Prior methods attempt to ease the imbalance problem through sampling techniques that re-assign and re-balance the distributions of an imbalanced dataset. In this paper, we propose a novel approach that under-samples the majority class to adjust the original imbalanced class distribution. This method is called the Similarity Majority Under-sampling Technique (SMUTE). By calculating the similarity between each majority class sample and its surrounding minority class samples, SMUTE effectively separates the majority and minority class samples to increase the recognition power for each class. The experimental results show that SMUTE can outperform current under-sampling methods when the same under-sampling rate is used.
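A minimal sketch of similarity-based majority under-sampling in the spirit of SMUTE is shown below; it is not the authors' exact procedure. Each majority sample is scored by its mean cosine similarity to its k nearest minority samples and the least similar majority samples are kept; the sampling rate, k, the selection direction and the toy data are all assumptions.

```python
# Minimal sketch of similarity-based majority under-sampling (SMUTE-style,
# not the authors' exact procedure). Majority samples most similar to the
# minority class are dropped first; rate and k are assumed parameters.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def similarity_undersample(X_maj, X_min, rate=0.5, k=5):
    sim = cosine_similarity(X_maj, X_min)            # (n_maj, n_min) similarities
    topk = np.sort(sim, axis=1)[:, -k:]              # k most similar minority samples
    score = topk.mean(axis=1)                        # similarity score per majority sample
    n_keep = int(len(X_maj) * rate)
    keep = np.argsort(score)[:n_keep]                # keep the least similar majority samples
    return X_maj[keep]

rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(200, 8))          # toy majority class
X_min = rng.normal(2.0, 1.0, size=(20, 8))           # toy minority class
X_maj_reduced = similarity_undersample(X_maj, X_min, rate=0.3)
print(X_maj_reduced.shape)                           # (60, 8)
```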
Li, Y, Huang, Y, Xu, R, Seneviratne, S, Thilakarathna, K, Cheng, A, Webb, D & Jourjon, G 1970, 'Deep Content: Unveiling Video Streaming Content from Encrypted WiFi Traffic', 2018 IEEE 17th International Symposium on Network Computing and Applications (NCA), 2018 IEEE 17th International Symposium on Network Computing and Applications (NCA), IEEE, Cambridge, MA, USA.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The proliferation of smart devices has led to an exponential growth in digital media consumption, especially mobile video for content marketing. The vast majority of the associated Internet traffic is now end-to-end encrypted, and while encryption provides better user privacy and security, it has made network surveillance an impossible task. The result is an unchecked environment for exploiters and attackers to distribute content such as fake, radical and propaganda videos. Recent advances in machine learning techniques have shown great promise in characterising encrypted traffic captured at the end points. However, video fingerprinting from passively listening to encrypted traffic, especially wireless traffic, has been reported as a challenging task due to the difficulty in distinguishing retransmissions and multiple flows on the same link. We show the potential of fingerprinting videos by passively sniffing WiFi frames in air, even without connecting to the WiFi network. We have developed Multi-Layer Perceptron (MLP) and Recurrent Neural Networks (RNNs) that are able to identify streamed YouTube videos from a closed set, by sniffing WiFi traffic encrypted at both Media Access Control (MAC) and Network layers. We compare these models to the state-of-the-art wired traffic classifier based on Convolutional Neural Networks (CNNs), and show that our models obtain similar results while requiring significantly less computational power and time (approximately a threefold reduction).
Li, Z, Zhang, J, Wu, Q & Kirsch, C 1970, 'Field-Regularised Factorization Machines for Mining the Maintenance Logs of Equipment', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Australasian Joint Conference on Artificial Intelligence, Springer International Publishing, New Zealand, pp. 172-183.
View/Download from: Publisher's site
View description>>
© Springer Nature Switzerland AG 2018. Failure prediction is very important for railway infrastructure. Traditionally, data from various sensors are collected for this task, while the value of maintenance logs is often neglected. Maintenance records of equipment usually indicate equipment status and could therefore be valuable for the prediction of equipment faults. In this paper, we propose Field-regularised Factorization Machines (FrFMs) to predict failures of railway points from maintenance logs. The Factorization Machine (FM) and its variants are state-of-the-art algorithms designed for sparse data. They are widely used in click-through rate prediction and recommendation systems. Categorical variables are converted to binary features through one-hot encoding and then fed into these models. However, field information is ignored in this process. We propose Field-regularised Factorization Machines to incorporate this valuable information. Experiments on a data set of railway maintenance logs and another public data set show the effectiveness of our methods.
Lian, D, Zheng, K, Zheng, VW, Ge, Y, Cao, L, Tsang, IW & Xie, X 1970, 'High-order Proximity Preserving Information Network Hashing', Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, London, United Kingdom, pp. 1744-1753.
View/Download from: Publisher's site
View description>>
© 2018 Association for Computing Machinery. Information network embedding is an effective way for efficient graph analytics. However, it still faces computational challenges in problems such as link prediction and node recommendation, particularly with the increasing scale of networks. Hashing is a promising approach for accelerating these problems by orders of magnitude. However, no prior studies have focused on seeking binary codes for information networks that preserve high-order proximity. Since matrix factorization (MF) unifies and outperforms several well-known embedding methods with high-order proximity preserved, we propose an MF-based Information Network Hashing (INH-MF) algorithm to learn binary codes which can preserve high-order proximity. We also suggest Hamming subspace learning, which only updates partial binary codes each time, to scale up INH-MF. We finally evaluate INH-MF on four real-world information network datasets with respect to the tasks of node classification and node recommendation. The results demonstrate that INH-MF can perform significantly better than competing learning-to-hash baselines in both tasks, and surprisingly outperforms network embedding methods, including DeepWalk, LINE and NetMF, in the task of node recommendation. The source code of INH-MF is available online.
Liu, C, Chen, L, Tsang, I & Yin, H 1970, 'Towards the Learning of Weighted Multi-label Associative Classifiers', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-7.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Because of their ability to capture the correlation between features and labels, association rules have been applied to multi-label classification. However, existing multi-label associative classification algorithms usually exploit association rules using heuristic strategies. Moreover, only the covering association rules whose feature set is a subset of the testing instance are considered. Discarding any mined rules may diminish the performance of the classifier, especially when some rules only differ from the testing instance by a few insignificant features. In this paper we propose Weighted Multi-label Associative Classifiers (WMAC), which leverage an extended set of association rules whose features overlap with those of the testing instance to learn a universal weight vector for features. For this purpose, we embed the set of rules into a linear model and weigh each association rule by its confidence. Empirical results on diverse datasets clearly demonstrate that WMAC outperforms other well-established multi-label classification algorithms.
Liu, W & Chivukula, A 1970, 'AI 2018: Advances in Artificial Intelligence', The 31st Australasian Joint Conference on Artificial Intelligence, The 31st Australasian Joint Conference on Artificial Intelligence, Springer International Publishing, Wellington, New Zealand, pp. 692-692.
View/Download from: Publisher's site
Liu, W, Chang, X, Chen, L & Yang, Y 1970, 'Semi-supervised Bayesian attribute learning for person re-identification', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, Thirty-Second AAAI Conference on Artificial Intelligence, AAAI, New Orleans, Louisiana, USA, pp. 7162-7169.
View description>>
Person re-identification (re-ID) tasks aim to identify the same person in multiple images captured from non-overlapping camera views. Most previous re-ID studies have attempted to solve this problem through either representation learning or metric learning, or by combining both techniques. Representation learning relies on the latent factors or attributes of the data. In most of these works, the dimensionality of the factors/attributes has to be manually determined for each new dataset. Thus, this approach is not robust. Metric learning optimizes a metric across the dataset to measure similarity according to distance. However, choosing the optimal method for computing these distances is data dependent, and learning the appropriate metric relies on a sufficient number of pair-wise labels. To overcome these limitations, we propose a novel algorithm for person re-ID, called semi-supervised Bayesian attribute learning. We introduce an Indian Buffet Process to identify the priors of the latent attributes. The dimensionality of the attribute factors is then automatically determined by nonparametric Bayesian learning. Meanwhile, unlike traditional distance metric learning, we propose a re-identification probability distribution to describe how likely it is that a pair of images contains the same person. This technique relies solely on the latent attributes of both images. Moreover, pair-wise labels that are not known can be estimated from pair-wise labels that are known, making this a robust approach for semi-supervised learning. Extensive experiments demonstrate the superior performance of our algorithm over several state-of-the-art algorithms on small-scale datasets and comparable performance on large-scale re-ID datasets.
Liu, W, Chang, X, Chen, L & Yang, Y 1970, 'Semi-supervised Joint Learning of Representation and Relation for Person Re-identification', AAAI Conference on Artificial Intelligence, Louisiana, USA.
Merigó, JM, Herrera-Viedma, E, Cobo, MJ, Laengle, S & Rivas, D 1970, 'A Bibliometric Analysis of the First Twenty Years of Soft Computing', Proceedings of the Conference of the European Society for Fuzzy Logic and Technology, International Workshop on Intuitionistic Fuzzy Sets and Generalized Nets, Springer International Publishing, Warsaw, Poland, pp. 517-528.
View/Download from: Publisher's site
View description>>
© 2018, Springer International Publishing AG. Soft Computing was launched in 1997. Today, the journal turns twenty years old. Motivated by this anniversary, this article develops a bibliometric analysis of the journal in order to identify its leading trends in terms of publications and citations. The work considers several issues including the leading authors, institutions and countries. The study also uses software to develop a graphical analysis. The results show significant growth of the journal in recent years, which has consolidated it as a leading journal in the field.
Merigo, JM, Herrera-Viedma, E, Yager, RR & Kacprzyk, J 1970, 'A Bibliometric Overview of the Research Impact of Lotfi A. Zadeh', 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, pp. 441-446.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Lotfi A. Zadeh is the founder of fuzzy logic and one of the most prominent computer scientists of all time. He passed away on 6 September 2017. In order to commemorate him and provide a complete overview of his research impact on the scientific community, this study presents a bibliometric overview of his publications according to the results available in the Web of Science Core Collection. The article also uses VOSviewer to graphically map the leading trends connected to Zadeh in terms of journals, papers, authors and countries. The bibliometric sources used mainly concern the more recent works of Zadeh, and one should bear in mind that his brilliant and prominent works on signal analysis, the Z-transform, the state space approach, optimal control, etc., are not included in our analyses.
Mirtalaie, MA, Hussain, OK, Chang, E & Hussain, FK 1970, 'Sentiment Analysis of Specific Product’s Features Using Product Tree for Application in New Product Development', Advances in Intelligent Networking and Collaborative Systems The 9th International Conference on Intelligent Networking and Collaborative Systems, International Conference on Intelligent Networking and Collaborative Systems, Springer International Publishing, Toronto, CANADA, pp. 82-95.
View/Download from: Publisher's site
View description>>
New Product Development (NPD) is a multi-step process by which novel products are introduced in the market. Sentiment analysis, which ascertains the popularity of each new feature added to the product, is one of the key steps in this process. In this paper we present an approach by which product designers analyze users’ reviews from social media platforms to determine the popularity of a specific product’s feature in order to make a decision about adding it to the product’s next generation. Our proposed approach utilizes a product tree generated from a product specification document to facilitate forming an efficient link between features mentioned in the users’ reviews and those of the product designer’s interest. Furthermore, it captures the links/interactions between a feature of interest and its other related features in a product to ascertain its polarity.
Pan, S, Hu, R, Long, G, Jiang, J, Yao, L & Zhang, C 1970, 'Adversarially Regularized Graph Autoencoder for Graph Embedding', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence, Stockholm. Sweden, pp. 2609-2615.
View/Download from: Publisher's site
View description>>
Graph embedding is an effective method to represent graph data in a low-dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
Pang, G, Cao, L, Chen, L & Liu, H 1970, 'Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection', Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, London, United Kingdom, pp. 2041-2050.
View/Download from: Publisher's site
View description>>
© 2018 Association for Computing Machinery. Learning expressive low-dimensional representations of ultrahigh-dimensional data, e.g., data with thousands/millions of features, has been a major way to enable learning methods to address the curse of dimensionality. However, existing unsupervised representation learning methods mainly focus on preserving the data regularity information and learning the representations independently of subsequent outlier detection methods, which can result in suboptimal and unstable performance of detecting irregularities (i.e., outliers). This paper introduces a ranking model-based framework, called RAMODO, to address this issue. RAMODO unifies representation learning and outlier detection to learn low-dimensional representations that are tailored for a state-of-the-art outlier detection approach - the random distance-based approach. This customized learning yields more optimal and stable representations for the targeted outlier detectors. Additionally, RAMODO can leverage little labeled data as prior knowledge to learn more expressive and application-relevant representations. We instantiate RAMODO to an efficient method called REPEN to demonstrate the performance of RAMODO. Extensive empirical results on eight real-world ultrahigh dimensional data sets show that REPEN (i) enables a random distance-based detector to obtain significantly better AUC performance and two orders of magnitude speedup; (ii) performs substantially better and more stably than four state-of-the-art representation learning methods; and (iii) leverages less than 1% labeled data to achieve up to 32% AUC improvement.
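A minimal sketch of a random distance-based outlier detector of the kind REPEN is designed to feed is shown below: the score is the nearest-neighbour distance within a small random subsample, averaged over several subsamples. The representation-learning part of RAMODO/REPEN is not reproduced here, and the subsample size, number of rounds and toy data are assumptions.

```python
# Minimal sketch of a random distance-based outlier score (not RAMODO/REPEN
# itself). Higher score means the point is farther from randomly sampled
# reference points and therefore more likely to be an outlier.
import numpy as np

def random_distance_scores(X, subsample_size=16, n_rounds=30, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    scores = np.zeros(n)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=subsample_size, replace=False)
        d = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2)
        d[np.arange(n)[:, None] == idx[None, :]] = np.inf   # ignore self-distances
        scores += d.min(axis=1)                              # distance to nearest subsample point
    return scores / n_rounds

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (300, 5)), rng.normal(6, 1, (5, 5))])  # 5 planted outliers
print(np.argsort(random_distance_scores(X))[-5:])             # indices of top-scored points
```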
Pang, G, Cao, L, Chen, L, Lian, D & Liu, H 1970, 'Sparse Modeling-Based Sequential Ensemble Learning for Effective Outlier Detection in High-Dimensional Numeric Data', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), New Orleans, USA, pp. 3892-3899.
View/Download from: Publisher's site
View description>>
The large proportion of irrelevant or noisy features in real-life high-dimensional data presents a significant challenge to subspace/feature selection-based high-dimensional outlier detection (a.k.a. outlier scoring) methods. These methods often perform two dependent tasks, relevant feature subset search and outlier scoring, independently, consequently retaining features/subspaces irrelevant to the scoring method and downgrading the detection performance. This paper introduces a novel sequential ensemble-based framework SEMSE and its instance CINFO to address this issue. SEMSE learns the sequential ensembles to mutually refine feature selection and outlier scoring by iterative sparse modeling with outlier scores as the pseudo target feature. CINFO instantiates SEMSE by using three successive recurrent components to build such sequential ensembles. Given outlier scores output by an existing outlier scoring method on a feature subset, CINFO first defines a Cantelli's inequality-based outlier thresholding function to select outlier candidates with a false positive upper bound. It then performs lasso-based sparse regression by treating the outlier scores as the target feature and the original features as predictors on the outlier candidate set to obtain a feature subset that is tailored for the outlier scoring method. Our experiments show that two different outlier scoring methods enabled by CINFO (i) perform significantly better on 11 real-life high-dimensional data sets, and (ii) have much better resilience to noisy features, compared to their bare versions and three state-of-the-art competitors. The source code of CINFO is available at https://sites.google.com/site/gspangsite/sourcecode.
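For reference, one standard form of Cantelli's (one-sided Chebyshev) inequality, which can motivate such a thresholding step for an outlier score s with mean mu and standard deviation sigma, is shown below; the exact thresholding function used by CINFO may differ.

```latex
P\left(s - \mu \ge \lambda\right) \le \frac{\sigma^2}{\sigma^2 + \lambda^2}, \qquad \lambda > 0 .
```

Choosing the threshold t = mu + sigma * sqrt((1 - alpha) / alpha) therefore bounds the false positive rate among scores drawn from that distribution by alpha.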
Poostchi, H & Piccardi, M 1970, 'Cluster Labeling by Word Embeddings and WordNet’s Hypernymy', https://www.aclweb.org/anthology/U18-1, Annual Workshop of The Australasian Language Technology Association, Dunedin, New Zealand.
View description>>
Cluster labeling is the assignment of representative labels to clusters of documents or words. Once assigned, the labels can play an important role in applications such as navigation, search and document classification. However, finding appropriately descriptive labels is still a challenging task. In this paper, we propose various approaches for assigning labels to word clusters by leveraging word embeddings and the synonymy and hypernymy relations in the WordNet lexical ontology. Experiments carried out using the WebAP document dataset have shown that one of the approaches stands out in the comparison and is capable of selecting labels that are reasonably aligned with those chosen by a pool of four human annotators.
Poostchi, H, Borzeshi, EZ & Piccardi, M 1970, 'BiLSTM-CRF for Persian Named-Entity Recognition', PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 11th International Conference on Language Resources and Evaluation (LREC), EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, JAPAN, Miyazaki, pp. 4427-4431.
Rahman, JS, Li, J, Xie, J, Fogelman, S & Blumenstein, M 1970, 'Connectivity Based Method for Clustering Microbial Communities from Metagenomics Data of Water and Soil Samples', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-8.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Understanding the microbial community structure of metagenomic water and soil samples is a key process in discovering the functions and impact of microorganisms on human and animal health. The evolution of Next Generation Sequencing (NGS) technology has encouraged researchers to sequence large quantities of microbial data from environmental sources. Clustering marker gene sequences into Operational Taxonomic Units (OTUs) is the most significant task in microbial community analysis. Several methods have been developed over the years to improve OTU picking strategies. However, building strongly connected OTUs is a major issue in the majority of these methods. Herein we present ConClust, a novel method for clustering OTUs that is based on quantifying connectivity among the sequences. Experimental analysis on two synthetic datasets and two real-world datasets from water and soil samples demonstrates that our method can mine robust OTUs. Our method can be highly beneficial for studying the functions of known and unknown microbes and analyzing their positive and negative effects on the environment as well as on human and animal health.
Razzak, MI, Saris, RA, Blumenstein, M & Xu, G 1970, 'Robust 2D Joint Sparse Principal Component Analysis With F-Norm Minimization For Sparse Modelling: 2D-RJSPCA', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil.
View/Download from: Publisher's site
Roberts, AGK, Catchpoole, DR & Kennedy, PJ 1970, 'Variance-based Feature Selection for Classification of Cancer Subtypes Using Gene Expression Data', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Classification in cancer has traditionally relied on feature selection by differential expression as a first step, where genes are selected according to the strength of evidence for a consistent difference in expression level between classes. However, recent work has shown that many genes also differ in the variance of their gene expression between disease states, and in particular between cancers of different types, prognosis, or stages of development. Features selected based on increased variance in cancer or differences in variance between tumours of differing prognosis have been used to successfully predict tumour progression or prognosis within the same cancer type, and to classify cancer subtypes in cases where there is an overall increase in variance in one class over the other. Here, we apply feature selection by differential variance to the more general problem of classification of cancer subtypes. We show that classifiers using features selected by differential variance are able to distinguish between clinically relevant cancer subtypes, that these classifiers perform as well as classifiers based on features selected by differential expression, and that combining the two approaches often gives better classification results than either feature selection method alone.
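A minimal sketch of feature selection by differential variance is shown below: genes are ranked by the absolute log-ratio of their per-class variances and the top k are kept. This is a simplification for illustration; the paper's exact statistic, and its combination with differential expression, may differ, and the toy data are assumptions.

```python
# Minimal sketch of variance-based feature selection for a two-class
# gene expression problem (a simplification of the approach above).
import numpy as np

def differential_variance_selection(X, y, k=50, eps=1e-8):
    classes = np.unique(y)
    assert len(classes) == 2, "sketch assumes a two-class problem"
    v0 = X[y == classes[0]].var(axis=0) + eps
    v1 = X[y == classes[1]].var(axis=0) + eps
    score = np.abs(np.log(v1 / v0))                 # large => variance differs between classes
    return np.argsort(score)[::-1][:k]              # indices of the top-k genes

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2000))                    # 100 samples x 2000 genes (toy data)
y = np.array([0] * 50 + [1] * 50)
X[y == 1, :10] *= 3.0                               # inflate variance of 10 genes in class 1
print(differential_variance_selection(X, y, k=10))
```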
Saeed, Z, Abbasi, RA, Sadaf, A, Razzak, MI & Xu, G 1970, 'Text Stream to Temporal Network - A Dynamic Heartbeat Graph to Detect Emerging Events on Twitter', PAKDD 2018: Advances in Knowledge Discovery and Data Mining, Pacific-Asia Conference on Knowledge Discovery and Data, Springer International Publishing, Australia, pp. 534-545.
View/Download from: Publisher's site
View description>>
Huge amounts of data are generated every second on the Internet. People around the globe publish and share information related to real-world events they experience every day. This provides a valuable opportunity to analyze the content of this information to detect real-world happenings; however, it is quite a challenging task. In this work, we propose a novel graph-based approach named the Dynamic Heartbeat Graph (DHG) that not only detects events at an early stage, but also suppresses them in the upcoming adjacent data stream in order to highlight new emerging events. This characteristic makes the proposed method interesting and efficient at finding emerging events and related topics. The experimental results on real-world datasets (i.e. FA Cup Final and Super Tuesday 2012) show a considerable improvement in most cases, while the time complexity remains very attractive.
Seifollahi, S, Piccardi, M & Borzeshi, EZ 1970, 'A Semi-supervised Hidden Markov Topic Model Based on Prior Knowledge', Communications in Computer and Information Science, Australasian Data Mining Conference, Springer Singapore, Melbourne, VIC, Australia,, pp. 265-276.
View/Download from: Publisher's site
View description>>
© Springer Nature Singapore Pte Ltd. 2018. A topic model is an unsupervised model to automatically discover the topics discussed in a collection of documents. Most of the existing topic models only use bag-of-words representations or single-word distributions and do not consider relations between words in the model. As a consequence, these models may generate topics which are not in good agreement with human-judged topic coherence. To mitigate this issue, we present a topic model which employs topically-related knowledge from prior topics and word co-occurrences/relations in the collection. To incorporate the prior knowledge, we leverage a two-staged semi-supervised Markov topic model. In the first stage, we estimate a transition matrix and a low-dimensional vocabulary for the final topic model. In the second stage, we produce the final topic model, where the topic assignment is performed following a Markov chain process. Experiments on real text documents from a major compensation agency demonstrate improvements in both the PMI score and topic coherence.
Shen, J, Wang, Y & Zhang, J 1970, 'Memory Optimized Deep Dense Network for Image Super-resolution', 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018 Digital Image Computing: Techniques and Applications (DICTA), IEEE, Canberra, Australia.
View/Download from: Publisher's site
View description>>
CNN methods for image super-resolution consume a large amount of training-time memory, because the feature size does not decrease as the network goes deeper. To reduce memory consumption during training, we propose a memory-optimized deep dense network for image super-resolution. We first reduce redundant feature learning by rationally designing the skip connections and dense connections in the network. Then we adopt shared memory allocations to store concatenated features and Batch Normalization intermediate feature maps. The memory-optimized network consumes less memory than a normal dense network. We also evaluate our proposed architecture on highly competitive super-resolution benchmark datasets. Our deep dense network outperforms some existing methods, and requires relatively less computation.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 1970, 'Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling', ICLR 2018.
View description>>
Recurrent neural networks (RNN), convolutional neural networks (CNN) and self-attention networks (SAN) are commonly used to produce context-aware representations. RNN can capture long-range dependency but is hard to parallelize and not time-efficient. CNN focuses on local dependency but does not perform well on some tasks. SAN can model both such dependencies via highly parallelizable computation, but memory requirement grows rapidly in line with sequence length. In this paper, we propose a model, called 'bi-directional block self-attention network (Bi-BloSAN)', for RNN/CNN-free sequence encoding. It requires as little memory as RNN but with all the merits of SAN. Bi-BloSAN splits the entire sequence into blocks, and applies an intra-block SAN to each block for modeling local context, then applies an inter-block SAN to the outputs for all blocks to capture long-range dependency. Thus, each SAN only needs to process a short sequence, and only a small amount of memory is required. Additionally, we use feature-level attention to handle the variation of contexts around the same word, and use forward/backward masks to encode temporal order information. On nine benchmark datasets for different NLP tasks, Bi-BloSAN achieves or improves upon state-of-the-art accuracy, and shows better efficiency-memory trade-off than existing RNN/CNN/SAN.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 1970, 'Bi-directional block self-attention for fast and memory-efficient sequence modeling', 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, International Conference on Representation Learning, Vancouver CANADA.
View description>>
Recurrent neural networks (RNN), convolutional neural networks (CNN) and self-attention networks (SAN) are commonly used to produce context-aware representations. RNN can capture long-range dependency but is hard to parallelize and not time-efficient. CNN focuses on local dependency but does not perform well on some tasks. SAN can model both such dependencies via highly parallelizable computation, but memory requirement grows rapidly in line with sequence length. In this paper, we propose a model, called “bi-directional block self-attention network (Bi-BloSAN)”, for RNN/CNN-free sequence encoding. It requires as little memory as RNN but with all the merits of SAN. Bi-BloSAN splits the entire sequence into blocks, and applies an intra-block SAN to each block for modeling local context, then applies an inter-block SAN to the outputs for all blocks to capture long-range dependency. Thus, each SAN only needs to process a short sequence, and only a small amount of memory is required. Additionally, we use feature-level attention to handle the variation of contexts around the same word, and use forward/backward masks to encode temporal order information. On nine benchmark datasets for different NLP tasks, Bi-BloSAN achieves or improves upon state-of-the-art accuracy, and shows better efficiency-memory trade-off than existing RNN/CNN/SAN.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 1970, 'Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together', 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics(NAACL-2019).
View description>>
Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies. Further, their expressive power and performance can be boosted by using a vector to measure pairwise dependency, but this requires expanding the alignment matrix to a tensor, which results in memory and computation bottlenecks. In this paper, we propose a novel attention mechanism called 'Multi-mask Tensorized Self-Attention' (MTSA), which is as fast and as memory-efficient as a CNN, but significantly outperforms previous CNN-/RNN-/attention-based models. MTSA 1) captures both pairwise (token2token) and global (source2token) dependencies by a novel compatibility function composed of dot-product and additive attentions, 2) uses a tensor to represent the feature-wise alignment scores for better expressive power but only requires parallelizable matrix multiplications, and 3) combines multi-head with multi-dimensional attentions, and applies a distinct positional mask to each head (subspace), so the memory and computation can be distributed to multiple heads, each with sequential information encoded independently. The experiments show that a CNN/RNN-free model based on MTSA achieves state-of-the-art or competitive performance on nine NLP benchmarks with compelling memory- and time-efficiency.
Shen, T, Zhou, T, Long, G, Jiang, J, Wang, S & Zhang, C 1970, 'Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling', 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 1-8.
View/Download from: Publisher's site
View description>>
Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, 'reinforced self-attention (ReSA)', for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called 'reinforced sequence sampling (RSS)', selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, 'reinforced self-attention network (ReSAN)', solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.
Shi, H, He, W & Xu, G 1970, 'Workshop Proposal on Knowledge Discovery from Digital Libraries', Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, JCDL '18: The 18th ACM/IEEE Joint Conference on Digital Libraries, ACM, Texas, USA, pp. 429-430.
View/Download from: Publisher's site
View description>>
© 2018 Authors. The workshop is co-located with the ACM/IEEE Joint Conference on Digital Libraries 2018 (JCDL 2018), which will be held in Fort Worth, Texas, USA on June 3-7, 2018. The Joint Conference on Digital Libraries (JCDL) is a major international forum focusing on digital libraries and associated technical, practical, and social issues.
Shi, Z, Zhang, JA, Xu, R & Fang, G 1970, 'Human Activity Recognition Using Deep Learning Networks with Enhanced Channel State Information', 2018 IEEE Globecom Workshops (GC Wkshps), 2018 IEEE Globecom Workshops (GC Wkshps), IEEE, Abu Dhabi, United Arab Emirates.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Channel State Information (CSI) is widely used for device-free human activity recognition. Feature extraction remains one of the most challenging tasks in a dynamic and complex environment. In this paper, we propose a human activity recognition scheme using Deep Learning Networks with enhanced Channel State Information (DLN-eCSI). We develop a CSI feature enhancement scheme (CFES), including two modules of background reduction and correlation feature enhancement, for preprocessing the data input to the DLN. After cleaning and compressing the signals using CFES, we apply a recurrent neural network (RNN) to automatically extract deeper features and then the softmax regression algorithm for activity classification. Extensive experiments are conducted to validate the effectiveness of the proposed scheme.
Thac Do, TD & Cao, L 1970, 'Gamma-Poisson dynamic matrix factorization embedded with metadata influence', Advances in Neural Information Processing Systems, Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation, Montreal, Canada, pp. 5824-5835.
View description>>
A conjugate Gamma-Poisson model for Dynamic Matrix Factorization incorporated with metadata influence (mGDMF for short) is proposed to effectively and efficiently model massive, sparse and dynamic data in recommendations. Modeling recommendation problems with a massive number of ratings and very sparse or even no ratings on some users/items in a dynamic setting is very demanding and poses critical challenges to well-studied matrix factorization models due to the large-scale, sparse and dynamic nature of the data. Our proposed mGDMF tackles these challenges by introducing three strategies: (1) constructing a stable Gamma-Markov chain model that smoothly drifts over time by combining both static and dynamic latent features of data; (2) incorporating the user/item metadata into the model to tackle sparse ratings; and (3) undertaking stochastic variational inference to efficiently handle massive data. mGDMF is conjugate, dynamic and scalable. Experiments show that mGDMF significantly (both effectively and efficiently) outperforms the state-of-the-art static and dynamic models on large, sparse and dynamic data.
Tofigh, F, Mao, G, Lipman, J & Abolhasan, M 1970, 'Crowd Density Mapping Based on Wi-Fi Measurements on Train Platforms', 2018 12th International Conference on Signal Processing and Communication Systems (ICSPCS), 2018 12th International Conference on Signal Processing and Communication Systems (ICSPCS), IEEE, Cairns, Australia.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Crowd distribution is a challenging issue at both the management and design levels. This paper provides a passive method to derive the crowd density distribution using Wi-Fi measurements in a real scenario. Six WiFi access points (APs) were deployed on platform 2/3 of Redfern station, Sydney, to monitor the platform for a week. Based on probability maps built from RSSI measurements and prior knowledge, the crowd distribution is calculated on the platform and the results are compared with distributions acquired from CCTV images. The final density heat maps are in good agreement with the results acquired from the CCTV cameras.
Verma, R, Merigo, JM & Mittal, N 1970, 'Triangular Fuzzy Partitioned Bonferroni Mean Operators and Their Application to Multiple Attribute Decision Making', 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, pp. 941-949.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The Bonferroni mean (BM) operator, introduced by Bonferroni, is a powerful tool to capture the interrelationship among aggregated arguments. Various generalizations and extensions of BM have been developed and applied to solve many real-world problems. Recently, the notion of the Partitioned Bonferroni mean (PBM) operator has been proposed with the assumption that interrelationships do not always exist among all of the attributes. This work studies the PBM operator under a triangular fuzzy environment. First, we propose a new fuzzy aggregation operator called the triangular fuzzy partitioned Bonferroni mean (TFPBM) operator for aggregating triangular fuzzy numbers. Some properties and special cases of the new aggregation operator are also investigated. For situations where the input arguments have different importance, we then define the triangular fuzzy weighted partitioned Bonferroni mean (TFWPBM) operator. Furthermore, based on the TFWPBM operator, an approach to deal with multiple attribute decision-making problems under a triangular fuzzy environment is developed. Finally, a practical example is provided to illustrate the developed approach.
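For reference, the classical Bonferroni mean of crisp nonnegative arguments, which the triangular fuzzy and partitioned operators in this paper extend, takes the following form (the fuzzy and weighted variants introduced above are not reproduced here):

```latex
\mathrm{BM}^{p,q}(a_1,\dots,a_n)
  = \left( \frac{1}{n(n-1)} \sum_{\substack{i,j=1 \\ i \neq j}}^{n} a_i^{\,p}\, a_j^{\,q} \right)^{\frac{1}{p+q}},
  \qquad p, q \ge 0 .
```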
Verma, R, Merigo, JM & Sahni, M 1970, 'On Generalized Fuzzy Jensen-Exponential Divergence and Its Application to Pattern Recognition', 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, pp. 1515-1519.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. This paper develops a novel information-theoretic divergence measure between two fuzzy sets based on the exponential function and applies it to solve pattern recognition problems. First, we generalize the idea of the fuzzy Jensen-exponential divergence and propose a new parametric divergence, called the fuzzy Jensen-exponential divergence of order-α, to measure the information of discrimination between two fuzzy sets. We also prove some properties of the proposed measure and discuss its particular cases. Finally, we apply the proposed divergence measure between fuzzy sets to deal with pattern recognition problems with fuzzy information.
Vishwa, A & Hussain, FK 1970, 'A Blockchain based approach for multimedia privacy protection and provenance', 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, Bangalore, India, India, pp. 1941-1945.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. There has been a vast increase in incidents related to multimedia copyright and security breaches in the past few years, compromising users' privacy. One such breach involved the seventh season of the TV series 'Game of Thrones', where episodes were illegally downloaded before the official release date. Such security breaches raise questions about the approaches and models that currently apply to data privacy and security, where the user saves and distributes his data personally or depends on a third party or stakeholder to manage the distribution rights of sensitive data. When it comes to multimedia, many companies or multimedia owners rely on third parties, distributors and sales persons to monitor their publicity, maintain their popularity and sell their multimedia content. Blockchain technology, which was originally devised for digital currency (cryptocurrency), has distinct features such as distributed networking, data privacy and trustless computing. This technology attracts great interest from the research community due to its innovative properties, which can be applied to many business applications, one being access control over data. In this paper, we present a decentralized data management framework that ensures user data privacy and control. We propose a protocol that uses blockchain technology to put the user in control of his data. This protocol enables the user to have full control over his multimedia files without needing to trust a third party. The framework allows the user not only to store data but also to query and share data, as well as supporting auditing. Finally, we discuss possible future extensions of blockchain technology as a medium to ensure privacy, data control, auditing and trust management in different areas.
Wahid -Ul- Ashraf, A, Budka, M & Musial-Gabrys, K 1970, 'Newton’s Gravitational Law for Link Prediction in Social Networks', Complex Networks & Their Applications VI Proceedings of Complex Networks 2017 (The Sixth International Conference on Complex Networks and Their Applications) (SCI 689), International Conference on Complex Networks and their Applications, Springer International Publishing, Lyon, France, pp. 93-104.
View/Download from: Publisher's site
View description>>
Link prediction is an important research area in network science due to a wide range of real-world applications. There are a number of link prediction methods. In the area of social networks, these methods are mostly inspired by social theory, for example the observation that two people with more mutual friends on a social network platform are more likely to become friends in the future. In this paper we take our inspiration from a different area, namely Newton's law of universal gravitation. Although this law deals with physical bodies, based on our intuition and empirical results we found that it can also work in networks, and especially in social networks. In order to apply this law, we had to endow nodes with notions of mass and distance. While node importance can be considered as mass, the shortest path, path count, or inverse similarity (Adamic-Adar, Katz score, etc.) can be considered as distance. In our analysis, we have primarily used degree centrality to denote the mass of the nodes, while the lengths of the shortest paths between them have been used as distances. In this study we compare the proposed link prediction approach to 7 other methods on 4 datasets from various domains. To this end, we use ROC curves and the AUC measure to compare the methods. The results show that our approach outperforms the other 7 methods on 2 out of the 4 datasets, and we also discuss the potential reasons for the observed behaviour.
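A minimal sketch of the gravity-inspired score described in the abstract is shown below, with node degree as "mass" and shortest-path length as "distance"; the toy graph and candidate-pair enumeration are assumptions for illustration, and the full evaluation protocol of the paper is not reproduced.

```python
# Minimal sketch of a gravity-inspired link prediction score:
# score(u, v) ~ degree(u) * degree(v) / shortest_path_length(u, v)^2.
import itertools
import networkx as nx

def gravitational_score(G, u, v):
    try:
        r = nx.shortest_path_length(G, u, v)
    except nx.NetworkXNoPath:
        return 0.0
    if r == 0:
        return 0.0
    return G.degree(u) * G.degree(v) / r ** 2       # F ~ m1 * m2 / r^2

G = nx.karate_club_graph()                           # toy social network
candidates = [(u, v) for u, v in itertools.combinations(G.nodes, 2)
              if not G.has_edge(u, v)]
ranked = sorted(candidates, key=lambda p: gravitational_score(G, *p), reverse=True)
print(ranked[:5])                                    # most strongly predicted future links
```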
Wahid-Ul-Ashraf, A, Budka, M & Musial, K 1970, 'NetSim -- The framework for complex network generator', Procedia Computer Science, Knowledge-Based and Intelligent Information & Engineering Systems, Elsevier, Belgrade, Serbia, pp. 547-556.
View/Download from: Publisher's site
View description>>
Networks are everywhere and their many types, including social networks, the Internet, food webs etc., have been studied for the last few decades. However, in real-world networks, it's hard to find examples that are easily comparable, i.e. have the same density or even number of nodes and edges. We propose a flexible and extensible NetSim framework to understand how properties in different types of networks change with varying numbers of edges and vertices. Our approach enables the simulation of three classical network models (random, small-world and scale-free) with easily adjustable model parameters and network size. To be able to compare different networks, for a single experimental setup we kept the number of edges and vertices fixed across the models. To understand how they change depending on the number of nodes and edges, we ran over 30,000 simulations and analysed different network characteristics that cannot be derived analytically. Two of the main findings from the analysis are that the average shortest path does not change with the density of the scale-free network but changes for small-world and random networks, and that there is an apparent difference in the mean betweenness centrality of the scale-free network compared with random and small-world networks.
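A minimal sketch of generating the three classical models compared above (random, small-world and scale-free) with a comparable number of nodes and edges is shown below; the specific parameter values are assumptions for illustration and do not reproduce the NetSim experimental setup.

```python
# Minimal sketch: generate the three classical network models with roughly
# the same number of nodes and edges, then compare a simple characteristic.
import networkx as nx

n = 1000                                             # nodes, kept fixed across models
random_g    = nx.gnm_random_graph(n, 4000)           # exactly 4000 edges
small_world = nx.watts_strogatz_graph(n, k=8, p=0.1) # n*k/2 = 4000 edges
scale_free  = nx.barabasi_albert_graph(n, m=4)       # ~(n-4)*4 = 3984 edges

for name, g in [("random", random_g), ("small-world", small_world),
                ("scale-free", scale_free)]:
    print(name, g.number_of_nodes(), g.number_of_edges(),
          nx.average_clustering(g))
```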
Wan, Y, Zhao, Z, Yang, M, Xu, G, Ying, H, Wu, J & Yu, PS 1970, 'Improving automatic source code summarization via deep reinforcement learning', Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE '18: 33rd ACM/IEEE International Conference on Automated Software Engineering, ACM, Corum, Montpellier, France, pp. 397-407.
View/Download from: Publisher's site
View description>>
© 2018 Association for Computing Machinery. Code summarization provides a high-level natural language description of the function performed by code, and it can benefit software maintenance, code categorization and retrieval. To the best of our knowledge, most state-of-the-art approaches follow an encoder-decoder framework which encodes the code into a hidden space and then decodes it into natural language space, suffering from two major drawbacks: a) the encoders only consider the sequential content of code, ignoring the tree structure which is also critical for the task of code summarization; b) the decoders are typically trained to predict the next word by maximizing the likelihood of the next ground-truth word given the previous ground-truth word, yet at test time they are expected to generate the entire sequence from scratch. This discrepancy can cause an exposure bias issue, making the learnt decoder suboptimal. In this paper, we incorporate an abstract syntax tree structure as well as the sequential content of code snippets into a deep reinforcement learning framework (i.e., an actor-critic network). The actor network provides the confidence of predicting the next word according to the current state. On the other hand, the critic network evaluates the reward value of all possible extensions of the current state and can provide global guidance for exploration. We employ an advantage reward composed of the BLEU metric to train both networks. Comprehensive experiments on a real-world dataset show the effectiveness of our proposed model when compared with some state-of-the-art methods.
Wang, J, Chen, L, Qin, L & Wu, X 1970, 'ASTM: An Attentional Segmentation Based Topic Model for Short Texts.', ICDM, IEEE International Conference on Data Mining, IEEE Computer Society, Singapore, Singapore, pp. 577-586.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. To address the data sparsity problem in short text understanding, various alternative topic models leveraging word embeddings as background knowledge have been developed recently. However, existing models combine auxiliary information and topic modeling in a straightforward way, without considering human reading habits. In contrast, extensive studies have shown that taking human attention into account holds great potential for textual analysis. Therefore, we propose a novel model, the Attentional Segmentation based Topic Model (ASTM), to integrate both word embeddings as supplementary information and an attention mechanism that segments short text documents into fragments of adjacent words receiving similar attention. Each segment is assigned to a topic and each document can have multiple topics. We evaluate the performance of our model on three real-world short text datasets. The experimental results demonstrate that our model outperforms the state-of-the-art in terms of both topic coherence and text classification.
Wang, L, Bao, X & Cao, L 2018, 'Interactive Probabilistic Post-Mining of User-Preferred Spatial Co-Location Patterns', 2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018 IEEE 34th International Conference on Data Engineering (ICDE), IEEE, Paris, France, pp. 1260-1263.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Spatial co-location pattern mining is an important task in spatial data mining. However, traditional mining frameworks often produce too many prevalent patterns, of which only a small proportion may be truly interesting to end users. To satisfy user preferences, this work proposes an interactive probabilistic post-mining method that discovers user-preferred co-location patterns from the early rounds of mined results by iteratively incorporating the user's feedback and probabilistically refining the preferred patterns. We first introduce a framework for interactively post-mining preferred co-location patterns, which enables a user to effectively discover the co-location patterns tailored to his/her specific preference. A probabilistic model is further introduced to measure the user feedback-based subjective preferences on the resultant co-location patterns. This measure is used not only to select sample co-location patterns in the iterative user feedback process but also to rank the results. The experimental results on real and synthetic data sets demonstrate the effectiveness of our approach.
Wang, S, Hu, L, Cao, L, Huang, X, Lian, D & Liu, W 2018, 'Attention-based transactional context embedding for next-item recommendation', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, United States, pp. 2532-2539.
View description>>
To recommend the next item to a user in a transactional context is practical yet challenging in applications such as marketing campaigns. Transactional context refers to the items that are observable in a transaction. Most existing transaction-based recommender systems (TBRSs) make recommendations by mainly considering recently occurring items instead of all the ones observed in the current context. Moreover, they often assume a rigid order between items within a transaction, which is not always practical. More importantly, a long transaction often contains many items irrelevant to the next choice, which tend to overwhelm the influence of the few truly relevant ones. Therefore, we posit that a good TBRS should not only consider all the observed items in the current transaction but also weight them with different relevance to build an attentive context that outputs the proper next item with a high probability. To this end, we design an effective attention-based transaction embedding model (ATEM) for context embedding that weights each observed item in a transaction without assuming an order. The empirical study on real-world transaction datasets proves that ATEM significantly outperforms the state-of-the-art methods in terms of both accuracy and novelty.
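The central mechanism, attention over an unordered set of observed items, can be sketched in a few lines; the toy below illustrates only that weighting idea with made-up embedding sizes and a single scoring vector, and is not the published ATEM model.

```python
# Toy sketch of attention over an unordered transactional context
# (illustrates the weighting idea; hypothetical sizes, not the ATEM model).
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 1000, 32
item_emb = rng.normal(size=(n_items, dim))      # item embedding table
w_att = rng.normal(size=(dim,))                 # attention scoring vector

def next_item_scores(context_items):
    """Score all candidate items given the items observed in a transaction."""
    ctx = item_emb[context_items]          # (|T|, dim), order is irrelevant
    att = np.exp(ctx @ w_att)
    att /= att.sum()                       # attention weights over the context
    context_vec = att @ ctx                # attentive context embedding
    return item_emb @ context_vec          # relevance of every candidate item

scores = next_item_scores([3, 17, 256])
print(scores.argsort()[-5:][::-1])         # top-5 next-item candidates
```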
Wang, Y, Shen, J & Zhang, J 2018, 'Deep Bi-Dense Networks for Image Super-Resolution', 2018 Digital Image Computing: Techniques and Applications (DICTA), International Conference on Digital Image Computing - Techniques and Applications (DICTA), IEEE, Canberra, Australia, pp. 404-411.
View/Download from: Publisher's site
View description>>
This paper proposes Deep Bi-Dense Networks (DBDN) for single image super-resolution (SISR). Our approach extends previous intra-block dense connection approaches by including novel inter-block dense connections. In this way, feature information propagates from a single dense block to all subsequent blocks, instead of to a single successor. To build a DBDN, we first construct intra-dense blocks, which extract and compress abundant local features via densely connected convolutional layers and compression layers for further feature learning. Then, we use an inter-block dense net to connect the intra-dense blocks, which allows each intra-dense block to propagate its own local features to all successors. Additionally, our bi-dense construction connects each block to the output, alleviating the vanishing gradient problem in training. The evaluation of our proposed method on five benchmark data sets shows that our DBDN outperforms the state of the art in SISR with a moderate number of network parameters.
Wu, D, Lu, J, Hussain, F, Doumouras, C & Zhang, G 2018, 'A workforce health insurance plan recommender system', Data Science and Knowledge Engineering for Sensing Decision Support, Conference on Data Science and Knowledge Engineering for Sensing Decision Support (FLINS 2018), World Scientific.
View/Download from: Publisher's site
Wu, W, Li, B, Chen, L & Zhang, C 2018, 'Efficient Attributed Network Embedding via Recursive Randomized Hashing', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 2861-2867.
View/Download from: Publisher's site
View description>>
Attributed network embedding aims to learn a low-dimensional representation for each node of a network, considering both the attributes and the structure information of the node. However, learning-based methods usually involve a substantial time cost, which makes them impractical without the help of a powerful workhorse. In this paper, we propose a simple yet effective algorithm, named NetHash, to solve this problem with only moderate computing capacity. NetHash employs the randomized hashing technique to encode shallow trees, each of which is rooted at a node of the network. The main idea is to efficiently encode both the attributes and the structure information of each node by recursively sketching the corresponding rooted tree from bottom (i.e., the predefined highest-order neighboring nodes) to top (i.e., the root node), and, in particular, to preserve as much of the information closer to the root node as possible. Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better accuracy.
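The bottom-up idea of sketching a rooted tree by recursively min-hashing a node's attributes together with its children's sketches can be illustrated compactly; the toy below uses a single hash function and a hand-made four-node graph, so it is a simplified stand-in rather than the actual NetHash algorithm.

```python
# Simplified toy of recursive randomized hashing of rooted trees
# (one hash function and tiny depth; not the actual NetHash algorithm).
import hashlib

def h(token: str) -> int:
    return int(hashlib.md5(token.encode()).hexdigest(), 16)

def sketch(node, attrs, neighbors, depth):
    """Min-hash the node's attributes merged with its neighbours' sketches."""
    values = {h(a) for a in attrs[node]}
    if depth > 0:
        values |= {sketch(nb, attrs, neighbors, depth - 1) for nb in neighbors[node]}
    return min(values)

# Tiny attributed graph: node -> attribute tokens, node -> adjacent nodes.
attrs = {0: ["ml", "graphs"], 1: ["ml"], 2: ["databases"], 3: ["graphs"]}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}

embeddings = {v: sketch(v, attrs, neighbors, depth=2) for v in attrs}
print(embeddings)   # nodes with similar neighbourhoods tend to collide
```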
Xu, J & Cao, L 2018, 'Vine Copula-Based Asymmetry and Tail Dependence Modeling', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer International Publishing, Melbourne, VIC, Australia, pp. 285-297.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. Financial variables such as asset returns in a massive market contain various hierarchical and horizontal relationships that form complicated dependence structures. Modeling these structures is challenging due to the stylized facts of market data. Much research in recent decades has shown that copulas are an effective method for describing relations among variables. Vine structures were introduced to represent the decomposition of multivariate copula functions. However, the construction of vine structures is still a tough problem owing to the geometrical data, conditional independence assumptions and the stylized facts. In this paper, we introduce a new bottom-up method to construct regular vine structures and apply the model to 12 currencies over 16 years as a case study to analyze the asymmetric and fat-tail features. The out-of-sample performance of our model is evaluated by Value at Risk, a widely used industrial benchmark. The experimental results show that our model and its intrinsic design significantly outperform industry baselines, and provide financially interpretable knowledge and profound insights into the dependence structures of multiple variables with complex dependencies and characteristics.
Yang, E, Deng, C, Liu, T, Liu, W & Tao, D 2018, 'Semantic Structure-based Unsupervised Deep Hashing', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, pp. 1064-1070.
View/Download from: Publisher's site
View description>>
Hashing is becoming increasingly popular for approximate nearest neighbor searching in massive databases due to its storage and search efficiency. Recent supervised hashing methods, which usually construct semantic similarity matrices to guide hash code learning using label information, have shown promising results. However, it is relatively difficult to capture and utilize the semantic relationships between points in unsupervised settings. To address this problem, we propose a novel unsupervised deep framework called Semantic Structure-based unsupervised Deep Hashing (SSDH). We first empirically study the deep feature statistics, and find that the distribution of the cosine distance for point pairs can be estimated by two half Gaussian distributions. Based on this observation, we construct the semantic structure by considering points with distances obviously smaller than the others as semantically similar and points with distances obviously larger than the others as semantically dissimilar. We then design a deep architecture and a pair-wise loss function to preserve this semantic structure in Hamming space. Extensive experiments show that SSDH significantly outperforms current state-of-the-art methods.
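The construction of the semantic structure from pairwise cosine distances can be sketched as thresholding against the two ends of the distance distribution; in the toy below, fixed percentile thresholds and random features stand in for the paper's deep features and half-Gaussian estimates.

```python
# Sketch of building a semantic structure from feature cosine distances
# (percentile thresholds stand in for the paper's half-Gaussian estimates).
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 128))                      # stand-in deep features
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
dist = 1.0 - feats @ feats.T                             # pairwise cosine distance

lo, hi = np.percentile(dist, 5), np.percentile(dist, 95)  # placeholder thresholds
S = np.zeros_like(dist)     # semantic structure: +1 similar, -1 dissimilar, 0 unknown
S[dist < lo] = 1.0
S[dist > hi] = -1.0
np.fill_diagonal(S, 1.0)
print((S == 1).sum(), (S == -1).sum(), (S == 0).sum())
```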
Yang, H, Pan, S, Zhang, P, Chen, L, Lian, D & Zhang, C 2018, 'Binarized attributed network embedding', 2018 IEEE International Conference on Data Mining (ICDM), 2018 IEEE International Conference on Data Mining (ICDM), IEEE, Singapore, Singapore, pp. 1476-1481.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Attributed network embedding enables joint representation learning of node links and attributes. Existing attributed network embedding models are designed in continuous Euclidean spaces, which often introduces data redundancy and imposes challenges in storage and computation costs. To this end, we present a Binarized Attributed Network Embedding model (BANE for short) to learn binary node representations. Specifically, we define a new Weisfeiler-Lehman proximity matrix to capture the data dependence between node links and attributes by aggregating the information of node attributes and links from neighboring nodes to a given target node in a layer-wise manner. Based on the Weisfeiler-Lehman proximity matrix, we formulate a new Weisfeiler-Lehman matrix factorization learning function under the binary node representation constraint. The learning problem is a mixed-integer optimization, and an efficient cyclic coordinate descent (CCD) algorithm is used as the solution. Node classification and link prediction experiments on real-world datasets show that the proposed BANE model outperforms the state-of-the-art network embedding methods.
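The layer-wise aggregation of attributes from neighbouring nodes can be sketched as repeated multiplication of a normalised adjacency matrix with the attribute matrix; the toy below is an illustrative stand-in for that aggregation step, not the published Weisfeiler-Lehman proximity matrix or its binarized factorization.

```python
# Sketch of layer-wise aggregation of node attributes over links
# (an illustrative stand-in for the aggregation idea, not the BANE model).
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)      # adjacency of a toy 4-node graph
X = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)   # node attributes

A_hat = A + np.eye(4)                          # add self-loops
P = A_hat / A_hat.sum(axis=1, keepdims=True)   # row-normalised propagation matrix

agg = X.copy()
for _ in range(2):                             # aggregate over 2 hops, layer by layer
    agg = P @ agg
print(agg)          # each row mixes a node's own and its neighbours' attributes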
Yao, L, Kusakunniran, W, Wu, Q, Zhang, J & Tang, Z 2018, 'Robust CNN-based Gait Verification and Identification using Skeleton Gait Energy Image', 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018 Digital Image Computing: Techniques and Applications (DICTA), IEEE, Canberra, Australia.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. As a kind of behavioral biometric feature, gait has been widely applied for human verification and identification. Approaches to gait recognition can be classified into two categories: model-free approaches and model-based approaches. Model-free approaches are sensitive to appearance changes, while for model-based approaches it is difficult to extract reliable body models from gait sequences. In this paper, based on the robust skeleton points produced by a two-branch multi-stage CNN network, a novel model-based feature, the Skeleton Gait Energy Image (SGEI), is proposed. Experimental results indicate that SGEI is more robust to clothing changes. Another contribution is that two different CNN-based architectures are proposed separately for gait verification and gait identification. Both architectures have been evaluated on the datasets, presenting satisfying performance and increased robustness for gait recognition in unconstrained environments with view and clothing variations.
Yao, Y, Zhang, J, Shen, F, Yang, W, Hua, X-S & Tang, Z 2018, 'Extracting Privileged Information from Untagged Corpora for Classifier Learning', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 1085-1091.
View/Download from: Publisher's site
View description>>
The performance of data-driven learning approaches is often unsatisfactory when the training data is inadequate either in quantity or quality. Manually labeled privileged information (PI), e.g. attributes, tags or properties, is usually incorporated to improve classifier learning. However, the process of manual labeling is time-consuming and labor-intensive. To address this issue, we propose to enhance classifier learning by extracting PI from untagged corpora, which can effectively eliminate the dependency on manually labeled data. In detail, we treat each selected PI as a subcategory and learn one classifier per subcategory independently. The classifiers for all subcategories are then integrated together to form a more powerful category classifier. In particular, we propose a new instance-level multi-instance learning (MIL) model to simultaneously select a subset of training images from each subcategory and learn the optimal classifiers based on the selected images. Extensive experiments demonstrate the superiority of our approach.
Yao, Y, Zhang, J, Shen, F, Yang, W, Huang, P & Tang, Z 2018, 'Discovering and distinguishing multiple visual senses for polysemous words', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, The AAAI Press, New Orleans, USA, pp. 523-530.
View description>>
To reduce the dependence on labeled data, there have been increasing research efforts on learning visual classifiers by exploiting web images. One issue that limits their performance is the problem of polysemy. To solve this problem, in this work we present a novel framework that addresses polysemy by allowing sense-specific diversity in search results. Specifically, we first discover a list of possible semantic senses to retrieve sense-specific images. Then we merge visually similar semantic senses and prune noise by using the retrieved images. Finally, we train a visual classifier for each selected semantic sense and use the learned sense-specific classifiers to distinguish multiple visual senses. Extensive experiments on classifying images into sense-specific categories and re-ranking search results demonstrate the superiority of our proposed approach.
Ying, H, Zhuang, F, Zhang, F, Liu, Y, Xu, G, Xie, X, Xiong, H & Wu, J 2018, 'Sequential Recommender System based on Hierarchical Attention Networks', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 3926-3932.
View/Download from: Publisher's site
View description>>
With a large amount of user activity data accumulated, it is crucial to exploit user sequential behavior for sequential recommendations. Conventionally, a user's general taste and recent demand are combined to improve recommendation performance. However, existing methods often neglect that user long-term preferences keep evolving over time, so building a static representation of a user's general taste may not adequately reflect these dynamic characteristics. Moreover, they integrate user-item or item-item interactions in a linear way, which limits the capability of the model. To this end, in this paper we propose a novel two-layer hierarchical attention network, which takes the above properties into account, to recommend the next item a user might be interested in. Specifically, the first attention layer learns user long-term preferences based on the representations of historically purchased items, while the second one outputs the final user representation by coupling user long-term and short-term preferences. The experimental study demonstrates the superiority of our method compared with other state-of-the-art ones.
Yusoff, B, Merigó, JM & Hornero, DC 2018, 'Analysis on Extensions of Multi-expert Decision Making Model with Respect to OWA-Based Aggregation Processes', Advances in Intelligent Systems and Computing, International Forum for Interdisciplinary Mathematics, Springer International Publishing, Palau Macaya, Barcelona, Spain, pp. 179-196.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. In this paper, an analysis of extensions of a multi-expert decision making model based on ordered weighted averaging (OWA) operators is presented. The focus is on the aggregation of criteria and the aggregation of the individual judgments of experts. First, a soft majority concept based on induced OWA (IOWA) operators and generalized quantifiers for aggregating the experts' judgments is analyzed, concentrating on both the classical and the alternative schemes of the decision making model. Secondly, an analysis of the weighting methods related to the unification of the weighted average (WA) and OWA is conducted. An alternative weighting technique, termed the alternative OWA-WA (AOWAWA) operator, is proposed. The multi-expert decision making model is then developed based on both aggregation processes, and a comparison is made to see the effect of different schemes for the fusion of the soft majority opinions of experts and of distinct weighting techniques for aggregating the criteria. A numerical example on the selection of investment strategies is provided for comparison purposes.
Yusoff, B, Merigó, JM & Hornero, DC 2018, 'Generalized OWA-TOPSIS Model Based on the Concept of Majority Opinion for Group Decision Making', Advances in Intelligent Systems and Computing, International Conference of the Forum for Interdisciplinary Mathematics, Springer International Publishing, Spain, pp. 124-139.
View/Download from: Publisher's site
View description>>
© Springer International Publishing AG, part of Springer Nature 2018. In this paper, an extension of the OWA-TOPSIS model for group decision making that incorporates a concept of majority opinion and generalized aggregation operators is proposed. To achieve this objective, two fusion schemes in the TOPSIS model are designed. First, an external fusion scheme is proposed to aggregate the experts' judgments with respect to the concept of majority opinion on each criterion. Then, an internal fusion scheme of the ideal and anti-ideal solutions representing the majority of experts is proposed, using the Minkowski OWA distance with the inclusion of the relative importance of criteria. The advantages of the proposed model include the consideration of a soft majority concept as a group aggregator and flexibility in applying decision strategies for analyzing the decision making process. In addition, instead of calculating the majority opinion with respect to the individual experts' judgments on each alternative, the proposed method takes into account the majority of experts on each criterion, which reflects the specificity of the criteria for the overall decision. A numerical example is provided to demonstrate the applicability of the proposed method, and comparisons are made between several aggregation operators and distance measures.
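The basic building blocks behind both of the preceding entries, an OWA aggregation and an OWA-weighted Minkowski distance to an ideal solution, can be written down directly; the weights, scores and lambda in the sketch below are arbitrary illustrative values, not the models proposed in these papers.

```python
# Sketch of an OWA aggregation and a Minkowski OWA distance to an ideal point
# (illustrative weights and lambda; not the specific models in these papers).
import numpy as np

def owa(values, weights):
    """Ordered weighted average: weights applied to values sorted in descending order."""
    return float(np.sort(values)[::-1] @ weights)

def minkowski_owa_distance(x, ideal, weights, lam):
    """OWA-weighted Minkowski distance between an alternative x and the ideal solution."""
    diffs = np.abs(np.asarray(x) - np.asarray(ideal)) ** lam
    return owa(diffs, weights) ** (1.0 / lam)

w = np.array([0.4, 0.3, 0.2, 0.1])           # weighting vector (sums to 1)
scores = np.array([0.7, 0.9, 0.4, 0.6])      # one expert's judgments on four criteria
ideal = np.array([1.0, 1.0, 1.0, 1.0])

print(owa(scores, w))                                    # aggregated judgment
print(minkowski_owa_distance(scores, ideal, w, lam=2))   # distance to the ideal solution
```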
Zhang, J, Wu, Q, Shen, C, Zhang, J, Lu, J & van den Hengel, A 2018, 'Goal-Oriented Visual Question Generation via Intermediate Rewards', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), European Conference on Computer Vision, Springer International Publishing, Munich, Germany, pp. 189-204.
View/Download from: Publisher's site
View description>>
© 2018, Springer Nature Switzerland AG. Despite significant progress on a variety of vision-and-language problems, developing a method capable of asking intelligent, goal-oriented questions about images has proven to be a formidable challenge. Towards this end, we propose a Deep Reinforcement Learning framework based on three new intermediate rewards, namely goal-achieved, progressive and informativeness, that encourage the generation of succinct questions, which in turn uncover valuable information towards the overall goal. By directly optimizing for questions that work quickly towards fulfilling the overall goal, we avoid the tendency of existing methods to generate long series of inane queries that add little value. We evaluate our model on the GuessWhat?! dataset and show that the resulting questions can help a standard 'Guesser' identify a specific object in an image at a much higher success rate.
Zhang, L, Cao, L, Luo, S, Gu, L, Chen, Y & Lian, Y 2018, 'Coupled Collective Matrix Factorization', 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), IEEE, Guangzhou, China, pp. 1023-1030.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Collective Matrix Factorization (CMF) makes rating predictions by jointly factorizing multiple matrices in recommender systems (RS), and it also provides a unified view of matrix factorization. However, CMF does not directly involve the user attributes and item attributes that represent the intrinsic characteristics of users and items, so it fails to capture the coupling relationships within and between entities, such as users and items, which represent low-level data characteristics and complexities and drive the rating dynamics. In this work, we propose a coupled CMF (CCMF), which not only incorporates entity attributes into rating prediction, but also incorporates the couplings within and between entities into CMF. Therefore, CCMF captures not only the latent variable-based relationships between ratings and specific dimensions at high levels, but also the underlying driving forces, i.e., the hierarchical couplings within and between entities representing the low-level data characteristics and complexities. This work also presents a unified framework for CCMF in RS. Experimental results on two real data sets show that our proposed model outperforms the MF-based approaches.
Zhang, L, Xu, J, Zhang, J & Gong, Y 2018, 'Information Enhancement for Travelogues via a Hybrid Clustering Model', 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018 Digital Image Computing: Techniques and Applications (DICTA), IEEE, Canberra, ACT, Australia, pp. 1-8.
View/Download from: Publisher's site
View description>>
Travelogues consist of textual information shared by tourists through web forums or other social media, and they often lack illustrations (images). On image sharing websites such as Flickr, users can post images with rich textual information: 'title', 'tag' and 'description'. The topics of travelogues usually revolve around beautiful scenery, so corresponding landscape images recommended for these travelogues can enhance the vividness of reading. However, it is difficult to fuse such information because the text attached to each image has diverse meanings/views. In this paper, we propose an unsupervised Hybrid Multiple Kernel K-means (HMKKM) model to link images and travelogues through multiple views. Multi-view matrices are built to reveal the correlations between several aspects. To further improve performance, we add a regularisation term based on textual similarity. To evaluate the effectiveness of the proposed method, a dataset is constructed from TripAdvisor and Flickr to find the related images for each travelogue. Experimental results demonstrate the superiority of the proposed model in comparison with other baselines.
Zhang, P, Wu, Q, Xu, J & Zhang, J 2018, 'Long-Term Person Re-identification Using True Motion from Videos', 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, Lake Tahoe, NV, USA, pp. 494-502.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Most person re-identification approaches and benchmarks assume that pedestrians go across the surveillance network without significant appearance changes within a brief period, which explicitly restricts person re-identification to a short-term event and leads to inter-sample similarity measurement by appearance matching. However, in many real-world scenarios pedestrians are likely to reappear in the surveillance network after a long time interval (long-term) and to change their clothing. These scenarios inevitably make the appearances of subjects more ambiguous and indistinguishable. In this paper, we consider these scenarios and propose a unified feature representation based on true motion cues from videos, named FIne moTion encoDing (FITD). Our hypothesis is that people keep constant motion patterns under non-distraction walking conditions; therefore, motion characteristics are more reliable than static appearance features for describing a walking person. In particular, we extract motion patterns hierarchically by encoding trajectory-aligned descriptors with Fisher vectors in a spatially aligned pyramid. To verify the benefits of the proposed FITD, we collect a new dataset specifically for long-term situations. Extensive experiments demonstrate the merits of our FITD, especially for long-term scenarios.
Zhang, X, Liu, Y, Zheng, Y, Zhao, Z, Li, J & Liu, Y 2018, 'Distinction Between Ships and Icebergs in SAR Images Using Ensemble Loss Trained Convolutional Neural Networks', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Australasian Joint Conference on Artificial Intelligence, Springer International Publishing, Wellington, New Zealand, pp. 216-223.
View/Download from: Publisher's site
View description>>
© Springer Nature Switzerland AG 2018. With the phenomenon of global warming, more new shipping routes will be opened and utilized by more and more ships in the polar regions, particularly in the Arctic. Synthetic aperture radar (SAR) has been widely used in ship and iceberg monitoring for maritime surveillance and safety in Arctic waters. At present, compared with the detection of ships or icebergs, the task of distinguishing between ships and icebergs in SAR images remains challenging. In this work, we propose a novel loss function, called the ensemble loss, to train convolutional neural networks (CNNs); it is a convex function that incorporates the traits of cross entropy and hinge loss. The ensemble-loss-trained CNN model for the distinction between ships and icebergs is evaluated on a real-world SAR data set and achieves a classification accuracy of 90.15%. Experiments on another real image data set also confirm the effectiveness of the proposed ensemble loss.
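A loss that mixes cross-entropy and hinge terms can be sketched for a binary ship/iceberg classifier; the exact formulation and the mixing coefficient below are illustrative placeholders, not the published ensemble loss, though the convex combination of two convex terms remains convex, in line with the abstract's claim.

```python
# Sketch of a loss mixing cross entropy and hinge terms for a binary classifier
# (illustrative formulation and mixing weight, not the paper's ensemble loss).
import numpy as np

def mixed_loss(logits, labels, alpha=0.5):
    """labels in {0, 1}; alpha is a hypothetical mixing coefficient."""
    probs = 1.0 / (1.0 + np.exp(-logits))                        # sigmoid
    ce = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    margins = (2 * labels - 1) * logits                          # signed margins
    hinge = np.maximum(0.0, 1.0 - margins)
    return np.mean(alpha * ce + (1 - alpha) * hinge)

logits = np.array([2.3, -0.7, 0.2, -3.1])   # raw network outputs
labels = np.array([1, 0, 1, 0])             # 1 = ship, 0 = iceberg (toy labels)
print(mixed_loss(logits, labels))
```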
Zhang, Z, Wu, Q, Wang, Y & Chen, F 2018, 'Fine-Grained and Semantic-Guided Visual Attention for Image Captioning', 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, Lake Tahoe, NV, USA, pp. 1709-1717.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Soft-attention is regarded as one of the representative methods for image captioning. Based on the end-to-end CNN-LSTM framework, it was the first to link the relevant visual information in the image with the semantic representation in the text (i.e. the caption). In recent years, several state-of-the-art methods have been published that are motivated by this approach and include more elegant fine-tuning operations. However, due to the constraints of the CNN architecture, the given image is only segmented into a fixed-resolution grid at a coarse level. The overall visual feature created for each grid cell indiscriminately fuses all the objects and/or their portions inside the cell. There is no semantic link among grid cells, although an object may be segmented into different grid cells. In addition, large-area stuff (e.g. sky and beach) cannot be represented in the current methods. To tackle the problems above, this paper proposes a new model based on the FCN-LSTM framework, which can segment the input image into a fine-grained grid. Moreover, the visual feature representing each grid cell is contributed only by the principal object, or its portion, in the corresponding cell. By adopting pixel-wise labels (i.e. semantic segmentation), the visual representations of different grid cells are correlated to each other. In this way, a mechanism of fine-grained and semantic-guided visual attention is created, which can better link the relevant visual information with each semantic meaning inside the text through the LSTM. Without elegant fine-tuning, comprehensive experiments show promising performance consistently across different evaluation metrics.
Zhang, Z, Wu, Q, Wang, Y & Chen, F 2018, 'Size-Invariant Attention Accuracy Metric for Image Captioning with High-Resolution Residual Attention', 2018 Digital Image Computing: Techniques and Applications (DICTA), 2018 Digital Image Computing: Techniques and Applications (DICTA), IEEE, Canberra, Australia, pp. 1-8.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. Spatial visual attention mechanisms have achieved significant performance improvements for image captioning. To quantitatively evaluate the performances of attention mechanisms, the 'attention correctness' metric has been proposed to calculate the sum of attention weights generated for ground truth regions. However, this metric cannot consistently measure the attention accuracy among the element regions with large size variance. Moreover, its evaluations are inconsistent with captioning performances across different fine-grained attention resolutions. To address these problems, this paper proposes a size-invariant evaluation metric by normalizing the 'attention correctness' metric with the size percentage of the attended region. To demonstrate the efficiency of our size-invariant metric, this paper further proposes a high-resolution residual attention model that uses RefineNet as the Fully Convolutional Network (FCN) encoder. By using the COCO-Stuff dataset, we can achieve pixel-level evaluations on both object and 'stuff' regions. We use our metric to evaluate the proposed attention model across four high fine-grained resolutions (i.e., 27×27, 40×40, 60×60, 80×80). The results demonstrate that, compared with the 'attention correctness' metric, our size-invariant metric is more consistent with the captioning performances and is more efficient for evaluating the attention accuracy.
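The normalisation of 'attention correctness' by the attended region's share of the image area can be sketched directly; the attention map, mask and grid size below are toy inputs, and the sketch illustrates the idea rather than the exact metric defined in the paper.

```python
# Sketch of normalising attention correctness by region size
# (toy attention map and mask; an illustration of the idea, not the exact metric).
import numpy as np

def attention_correctness(att, mask):
    """Sum of attention weights falling inside the ground-truth region."""
    att = att / att.sum()
    return float(att[mask].sum())

def size_invariant_accuracy(att, mask):
    """Attention correctness divided by the region's share of the image area."""
    return attention_correctness(att, mask) / mask.mean()

rng = np.random.default_rng(0)
att = rng.random((40, 40))                    # a 40x40 attention map
mask = np.zeros((40, 40), dtype=bool)
mask[5:15, 5:15] = True                       # ground-truth region (6.25% of the image)

print(attention_correctness(att, mask), size_invariant_accuracy(att, mask))
```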
Zheng, Y, Peng, H, Zhang, X, Gao, X & Li, J 2018, 'Predicting Drug Targets from Heterogeneous Spaces using Anchor Graph Hashing and Ensemble Learning', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil.
View/Download from: Publisher's site
View description>>
© 2018 IEEE. The in silico prediction of potential drug-target interactions is of critical importance in drug research. Existing computational methods have achieved remarkable prediction accuracy but usually suffer from poor prediction efficiency due to computational problems. To improve the prediction efficiency, we propose to predict drug targets based on the integration of heterogeneous features with anchor graph hashing and ensemble learning. First, we encode each drug as a 5682-bit vector and each target as a 4198-bit vector using their heterogeneous features. Then, these vectors are embedded into a low-dimensional Hamming space using anchor graph hashing. Next, we append the hashing bits of a target to the hashing bits of a drug to form a vector representing the drug-target pair. Finally, vectors of positive samples composed of known drug-target pairs and randomly selected negative samples are used to train and evaluate the ensemble learning model. The performance of the proposed method is evaluated on simulative target prediction for 1094 drugs from DrugBank. Extensive comparison experiments demonstrate that the proposed method can achieve high prediction efficiency while preserving satisfactory accuracy. In fact, it is 99.3 times faster and only 0.001 lower in AUC than the best literature method, the 'Pairwise Kernel Method'.
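The pair-representation and ensemble-classification step described above can be sketched with scikit-learn; in the sketch below, random bits stand in for the anchor-graph hash codes and random labels for known interactions, so it only illustrates the concatenate-and-classify pipeline, not the paper's results.

```python
# Sketch of the pair-representation + ensemble-classifier step
# (random stand-in hash bits and labels; anchor graph hashing is not reproduced).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_pairs, drug_bits, target_bits = 2000, 64, 64
drug_hash = rng.integers(0, 2, size=(n_pairs, drug_bits))      # hypothetical hash codes
target_hash = rng.integers(0, 2, size=(n_pairs, target_bits))
X = np.hstack([drug_hash, target_hash])                        # drug-target pair vector
y = rng.integers(0, 2, size=n_pairs)                           # 1 = known interaction (toy)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:1500], y[:1500])
print(clf.score(X[1500:], y[1500:]))       # held-out accuracy on the toy labels
```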
Zhou, Z, Liu, S, Xu, G, Xie, X, Yin, J, Li, Y & Zhang, W 2018, 'Knowledge-Based Recommendation with Hierarchical Collaborative Embedding', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer International Publishing, Melbourne, Australia, pp. 222-234.
View/Download from: Publisher's site
View description>>
© 2018, Springer International Publishing AG, part of Springer Nature. Data sparsity is a common issue in recommendation systems, particularly in collaborative filtering. In real recommendation scenarios, user preferences are often quantitatively sparse because of the nature of the application. To address this issue, we propose a knowledge graph-based semantic information enhancement mechanism to enrich user preferences. Specifically, the proposed Hierarchical Collaborative Embedding (HCE) model leverages both the network structure and the text information embedded in knowledge bases to supplement traditional collaborative filtering. The HCE model jointly learns the latent representations from user preferences, the linkages between items and the knowledge base, as well as the semantic representations from the knowledge base. Experimental results on a GitHub dataset demonstrate that the semantic information from the knowledge base has been properly captured, resulting in improved recommendation performance.