Cao, L 2009, 'Actionable Knowledge Discovery' in Mehdi Khosrow-Pour (ed), Encyclopedia of Information Science and Technology, Second Edition, IGI Global, Hershey, PA, USA, pp. 8-13.
View/Download from: Publisher's site
View description>>
Actionable knowledge discovery is selected as one of the greatest challenges (Ankerst, 2002; Fayyad, Shapiro, & Uthurusamy, 2003) of next-generation knowledge discovery in databases (KDD) studies (Han & Kamber, 2006). In existing data mining, mined patterns are often not actionable with respect to real user needs. To enhance knowledge actionability, domain-related social intelligence is essential (Cao et al., 2006b). The involvement of domain-related social intelligence in data mining leads to domain-driven data mining (Cao & Zhang, 2006a, 2007a), which complements the traditional data-centered mining methodology. Domain-related social intelligence consists of the intelligence of humans, domains, environments, society and cyberspace, which complements data intelligence. The extension of KDD toward domain-driven data mining involves many challenging but promising research and development issues in KDD. Studies of these issues may promote the paradigm shift of KDD from data-centered interesting pattern mining to domain-driven actionable knowledge discovery, and the deployment shift from simulated data sets to real-life data and business environments, as widely predicted.
Cao, L 2009, 'Introduction to Agent Mining Interaction and Integration' in Longbing Cao (ed), Data Mining and Multi-agent Integration, Springer US, New York, USA, pp. 3-36.
View/Download from: Publisher's site
View description>>
In recent years, more and more researchers have been involved in research on both agent technology and data mining. A clear disciplinary effort has been activated toward removing the boundary between them, that is, the interaction and integration between agent technology and data mining. We refer to this as agent mining, a new area. The marriage of agents and data mining is driven by the challenges faced by both communities, and by the need to develop more advanced intelligence, information processing and systems. This chapter presents an overall picture of agent mining from the perspective of positioning it as an emerging area. We summarize the main driving forces, complementary essence, disciplinary framework, applications, case studies, and trends and directions, as well as brief observations on agent-driven data mining, data mining-driven agents, and mutual issues in agent mining. Arguably, we draw the following conclusions: (1) agent mining emerges as a new area in the scientific family, (2) both agent technology and data mining can greatly benefit from agent mining, and (3) it is very promising to result in additional advancement in intelligent information processing and systems. However, as a new open area, there are many issues waiting for research and development from theoretical, technological and practical perspectives.
Cao, L & He, X-Z 2009, 'Developing actionable trading agents' in Jain, LC & Nguyen, NT (eds), Knowledge Processing and Decision Making in Agent-Based Systems, Springer, Berlin, Germany, pp. 193-215.
View/Download from: Publisher's site
View description>>
Trading agents are useful for developing and back-testing quality trading strategies to support smart trading actions in the market. However, most existing trading agent research oversimplifies trading strategies and focuses on simulated ones. As a result, there is a big gap between the deliverables and business needs when the developed strategies are deployed in real life. Therefore, the actionable capability of developed trading agents is often very limited. This paper for the first time introduces effective approaches for optimizing and integrating multiple classes of strategies through trading agent collaboration. An integration and optimization approach is proposed to identify the optimal trading strategy in each category, and to further integrate optimal strategies across classes. Positions associated with these optimal strategies are recommended for trading agents to take actions in the market. Extensive experiments on a large quantity of real-life market data show that trading agents following the recommended strategies have great potential to obtain high benefits at low costs. This verifies that it is promising to develop trading agents toward workable strategies that satisfy business needs.
Cao, L, Yu, PS, Zhang, C & Zhang, H 2009, 'Preface' in Cao, L, Yu, PS, Zhang, C & Zhang, H (eds), Data Mining for Business Applications, Springer, pp. v-vi.
Franco, UA, Paul, JK, Daniel, RC, Dachuan, G & Simeon, JS 2009, 'Microarray Data Mining: Selecting Trustworthy Genes with Gene Feature Ranking' in Data Mining for Business Applications, Springer US, New York, USA, pp. 159-168.
View/Download from: Publisher's site
View description>>
Gene expression datasets used in biomedical data mining frequently have two characteristics: they have many thousand attributes but only relatively few sample points and the measurements are noisy. In other words, individual expression measurements may be untrustworthy. Gene Feature Ranking (GFR) is a feature selection methodology that addresses these domain specific characteristics by selecting features (i.e. genes) based on two criteria: (i) how well the gene can discriminate between classes of patient and (ii) the trustworthiness of the microarray data associated with the gene. An example from the pediatric cancer domain demonstrates the use of GFR and compares its performance with a feature selection method that does not explicitly address the trustworthiness of the underlying data. © 2009 Springer US.
Cao, L 2009, 'Introduction to Domain Driven Data Mining' in Cao, L, Yu, PS, Zhang, C & Zhang, H (eds), Data Mining for Business Applications, Springer US, New York, USA, pp. 3-10.
View/Download from: Publisher's site
View description>>
Mainstream data mining faces critical challenges and lacks the soft power to solve real-world complex problems when deployed. Following the paradigm shift from 'data mining' to 'knowledge discovery', we believe much more thorough efforts are essential for promoting the wide acceptance and employment of knowledge discovery in real-world smart decision making. To this end, we expect a new paradigm shift from 'data-centered knowledge discovery' to 'domain-driven actionable knowledge discovery'. In domain-driven actionable knowledge discovery, ubiquitous intelligence must be involved and meta-synthesized into the mining process, and an actionable knowledge discovery-based problem-solving system is formed as the space for data mining. This is the motivation and aim of developing Domain Driven Data Mining (D3M for short). This chapter briefs the main reasons, ideas and open issues in D3M. © 2009 Springer US.
Stoianoff, NP 2009, 'The Recognition of Traditional Knowledge under Australian Biodiscovery Regimes: Why Bother with Intellectual Property Rights?' in Antons, C (ed), Traditional Knowledge, Traditional Cultural Expressions and Intellectual Property Law in the Asia-Pacific Region, Kluwer Law International, The Hague, Netherlands, pp. 293-311.
View description>>
NA
Stoianoff, NP & Kelly, AH 2009, 'Conserving Native Vegetation on Private Land: Subsidizing Sustainable Use of Biodiversity?' in Deketelaere, K, Milne, JE, Kreiser, L & Ashiabor, H (eds), Critical Issues in Environmental Taxation. International and Comparative Perspectives: Volume IV, Oxford University Press, Oxford, UK, pp. 299-315.
View description>>
The Convention on Biological Diversity 1992 (the Biodiversity Convention) has as its primary objective the conservation of biological diversity. Running a close second is the objective of sustainable use of biological diversity. Simultaneous achievement of such objectives often runs contrary to the desires of large landowners in Australia, particularly when such landowners are engaged in primary production industries.
Whitney, M & Ryan, L 2009, 'Quantifying Dose–Response Uncertainty Using Bayesian Model Averaging' in Uncertainty Modeling in Dose Response, Wiley, pp. 165-179.
View/Download from: Publisher's site
Whitney, M & Ryan, L 2009, 'Response to Comments' in Uncertainty Modeling in Dose Response, Wiley, pp. 194-196.
View/Download from: Publisher's site
Wu, S, Zhao, Y, Zhang, H, Zhang, C, Cao, L & Bohlscheid, H 2009, 'Debt Detection in Social Security by Adaptive Sequence Classification' in Karagiannis, D & Jin, Z (eds), Knowledge Science, Engineering and Management, Springer Berlin Heidelberg, Germany, pp. 192-203.
View/Download from: Publisher's site
View description>>
Debt detection is important for improving payment accuracy in social security. Since debt detection from customer transaction data can be generally modelled as a fraud detection problem, a straightforward solution is to extract features from transaction sequences and build a sequence classifier for debts. For long-running debt detections, the patterns in the transaction sequences may exhibit variation from time to time, which makes it imperative to adapt classification to the pattern variation. In this paper, we present a novel adaptive sequence classification framework for debt detection in a social security application. The central technique is to catch up with the pattern variation by boosting discriminative patterns and depressing less discriminative ones according to the latest sequence data. © 2009 Springer-Verlag Berlin Heidelberg.
Ye, Y, He, S, Li, J, Jia, W & Wu, Q 2009, 'Rough Sets and Knowledge Technology' in Wen, P, Li, Y, Polkowski, L, Yao, Y, Tsumoto, S & Wang, G (eds), Rough Sets and Knowledge Technology, Springer Berlin Heidelberg, Berlin, Germany, pp. 571-578.
View/Download from: Publisher's site
View description>>
Spiral Architecture, a hexagonal image structure, is a novel and powerful approach to machine vision systems. The pixels in Spiral Architecture are geometrically arranged using a 1D (Spiral) addressing scheme in ascending order along a spiral-like curve. Spiral addition and Spiral multiplication are defined based on the Spiral addresses on Spiral Architecture. These two fundamental operations enable fast and easy translation, rotation and separation of images, and hence play very important roles in image processing on Spiral Architecture. Moreover, 2D coordinates defined by rows and columns on the Spiral structure provide a good mapping to the ordinary 2D coordinates defined on the common square image structure. Therefore, converting 1D Spiral addresses to and from 2D coordinates on Spiral Architecture has become very important for applying the theory developed on a hexagonal image structure to image processing (e.g., rotation). In this paper, we present a fast way to correctly locate any hexagonal pixel when its Spiral address is known, and to compute the Spiral address of any hexagonal pixel when its location is known. As an illustration of the use of these conversions, we demonstrate accurate image translation and rotation with experimental results.
Zhang, Y & Xu, G 2009, 'Singular Value Decomposition' in Encyclopedia of Database Systems, Springer US, pp. 2657-2658.
View/Download from: Publisher's site
Zhao, Y, Cao, L, Zhang, H & Zhang, C 2009, 'Handbook of Research on Innovations in Database Technologies and Applications' in Ferraggine, VE, Doorn, JH & Rivero, LC (eds), Handbook of Research on Innovations in Database Technologies and Applications: Current and Future Tr, IGI Global, USA, pp. 562-572.
View/Download from: Publisher's site
View description>>
Clustering is one of the most important techniques in data mining. This chapter presents a survey of popular approaches for data clustering, including well-known clustering techniques, such as partitioning clustering, hierarchical clustering, density-based clustering and grid-based clustering, and recent advances in clustering, such as subspace clustering, text clustering and data stream clustering. The major challenges and future trends of data clustering will also be introduced in this chapter. The remainder of this chapter is organized as follows. The background of data clustering will be introduced in Section 2, including the definition of clustering, categories of clustering techniques, features of good clustering algorithms, and the validation of clustering. Section 3 will present main approaches for clustering, which range from the classic partitioning and hierarchical clustering to recent approaches of bi-clustering and semisupervised clustering. Challenges and future trends will be discussed in Section 4, followed by the conclusions in the last section.
Zhao, Y, Zhang, H, Cao, L, Zhang, H, Bohlscheid, H, Ou, Y & Zhang, C 2009, 'Data Mining Applications in Social Security' in Cao, L, Yu, PS, Zhang, C & Zhang, H (eds), Data Mining for Business Applications, Springer US, New York, USA, pp. 81-96.
View/Download from: Publisher's site
View description>>
This chapter presents four applications of data mining in social security. The first is an application of decision tree and association rules to find the demographic patterns of customers. Sequence mining is used in the second application to find activity sequence patterns related to debt occurrence. In the third application, combined association rules are mined from heterogeneous data sources to discover patterns of slow payers and quick payers. In the last application, clustering and analysis of variance are employed to check the effectiveness of a new policy. © 2009 Springer US.
Cao, L & He, T 2009, 'Developing actionable trading agents', Knowledge and Information Systems, vol. 18, no. 2, pp. 183-198.
View/Download from: Publisher's site
View description>>
Trading agents are useful for developing and back-testing quality trading strategies to support smart trading actions in the market. However, most existing trading agent research oversimplifies trading strategies and focuses on simulated ones. As a result, there is a big gap between the deliverables and business needs when the developed strategies are deployed in real life. Therefore, the actionable capability of developed trading agents is often very limited. This paper for the first time introduces effective approaches for optimizing and integrating multiple classes of strategies through trading agent collaboration. An integration and optimization approach is proposed to identify the optimal trading strategy in each category, and to further integrate optimal strategies across classes. Positions associated with these optimal strategies are recommended for trading agents to take actions in the market. Extensive experiments on a large quantity of real-life market data show that trading agents following the recommended strategies have great potential to obtain high benefits at low costs. This verifies that it is promising to develop trading agents toward workable strategies that satisfy business needs. © Springer-Verlag London Limited 2008.
Cao, L & Yu, P 2009, 'Behavior Informatics: An Informatics Perspective for Behavior Studies', IEEE Computational Intelligence Bulletin, vol. 10, no. 1, pp. 6-11.
View description>>
Behavior is increasingly recognized as a key entity in business intelligence and problem-solving. Behavior analysis has been extensively investigated in the social and behavioral sciences, where qualitative and psychological methods have been the main means; however, to conduct formal representation and deep quantitative analysis, it is timely to investigate behavior from an informatics perspective. This article highlights the basic framework of behavior informatics, which aims to supply methodologies, approaches, means and tools for formal behavior modeling and representation, behavioral data construction, behavior impact modeling, behavior network analysis, behavior pattern analysis, and behavior presentation, management and use. Behavior informatics can greatly complement existing studies by providing more formal, quantitative and computable mechanisms and tools for deep understanding and use.
Cao, L, Gorodetsky, V & Mitkas, PA 2009, 'Agents and data mining', IEEE Intelligent Systems, vol. 24, no. 3, pp. 14-15.
View/Download from: Publisher's site
Chen, L, Bhowmick, SS & Nejdl, W 2009, 'NEAR-Miner', Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 1150-1161.
View/Download from: Publisher's site
View description>>
Web archives preserve the history of autonomous Web sites and are potential gold mines for all kinds of media and business analysts. The most common Web archiving technique uses crawlers to automate the process of collecting Web pages. However, (re)downloading an entire collection of pages periodically from a large Web site is infeasible. In this paper, we take a step towards addressing this problem. We devise a data mining-driven policy for selectively (re)downloading Web pages that are located in hierarchical directory structures which are believed to have changed significantly (e.g., a substantial percentage of pages have been inserted to/removed from the directory). Consequently, there is no need to download and maintain pages that have not changed since the last crawl, as they can be easily retrieved from the archive. In our approach, we propose an off-line data mining algorithm called near-Miner that analyzes the evolution history of Web directory structures of the original Web site stored in the archive and mines negatively correlated association rules (near) between ancestor-descendant Web directories. These rules indicate the evolution correlations between Web directories. Using the discovered rules, we propose an efficient Web archive maintenance algorithm called warm that optimally skips subdirectories (during the next crawl) which are negatively correlated with their ancestors in undergoing significant changes. Our experimental results with real data show that our approach improves the efficiency of the archive maintenance process signifi...
Chen, W, Li, J & Lu, P 2009, 'Progress of photonic crystal fibers and their applications', Frontiers of Optoelectronics in China, vol. 2, no. 1, pp. 50-57.
View/Download from: Publisher's site
Clark, CR, Kawachi, I, Ryan, L, Ertel, K, Fay, ME & Berkman, LF 2009, 'Perceived neighborhood safety and incident mobility disability among elders: the hazards of poverty', BMC PUBLIC HEALTH, vol. 9, no. 162, pp. 0-0.
View/Download from: Publisher's site
View description>>
Background. We investigated whether lack of perceived neighborhood safety due to crime, or living in high crime neighborhoods, was associated with incident mobility disability in elderly populations. We hypothesized that low-income elders and elders at retirement age (65-74) would be at greatest risk of mobility disability onset in the face of perceived or measured crime-related safety hazards. Methods. We conducted the study in the New Haven Established Populations for Epidemiologic Studies of the Elderly (EPESE), a longitudinal cohort study of community-dwelling elders aged 65 and older who were residents of New Haven, Connecticut in 1982. Elders were interviewed beginning in 1982 to assess mobility (ability to climb stairs and walk a half mile), perceptions of their neighborhood safety due to crime, annual household income, lifestyle characteristics (smoking, alcohol use, physical activity), and the presence of chronic co-morbid conditions. Additionally, we collected baseline data on neighborhood crime events from the New Haven Register newspaper in 1982 to measure local area crime rates at the census tract level. Results. At baseline in 1982, 1,884 elders were without mobility disability. After 8 years of follow-up, perceiving safety hazards was associated with increased risk of mobility disability among elders at retirement age whose incomes were below the federal poverty line (HR 1.56, 95% CI 1.02-2.37). No effect of perceived safety hazards was found among elders at retirement age whose incomes were above the poverty line. No effect of living in neighborhoods with high crime rates (measured by newspaper reports) was found in any sub-group. Conclusion. Perceiving a safety hazard due to neighborhood crime was associated with increased risk of incident mobility disability among impoverished elders near retirement age. Consistent with prior literature, retirement age appears to be a vulnerable period with respect to the effect of neighborhood conditio...
Du, C, Yang, J, Wu, Q & Zhang, T 2009, 'Face recognition using message passing based clustering method', Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 608-613.
View/Download from: Publisher's site
View description>>
Traditional subspace analysis methods are inefficient and tend to be affected by noise as they compare the test image to all training images, especially when there are large numbers of training images. To solve this problem, we propose a fast face recognition (FR) technique called APLDA by combining a novel clustering method, affinity propagation (AP), with linear discriminant analysis (LDA). By using AP on the reduced features derived from LDA, a representative face image for each subject can be obtained. Thus, APLDA uses only the representative images rather than all training images for identification. Obviously, APLDA is much more computationally efficient than Fisherface. Also, unlike Fisherface, which uses a pattern classifier for identification, APLDA performs identification by using AP once again to cluster the test image with one of the representative images. Experimental results also indicate that APLDA outperforms Fisherface in terms of recognition rate. © 2009 Elsevier Inc.
Forno, E, Onderdonk, AB, McCracken, J, Litonjua, AA, Laskey, D, Delaney, ML, DuBois, AM, Gold, DR, Ryan, LM, Weiss, ST & Celedon, JC 2009, 'Diversity of the Gut Microbiota and Eczema in Infants', AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, vol. 179.
Gabrys, B & Anguita, D 2009, 'Nature-inspired learning and adaptive systems', Natural Computing, vol. 8, no. 2, pp. 197-198.
View/Download from: Publisher's site
Guha, S, Ryan, L & Morara, M 2009, 'Gauss-Seidel Estimation of Generalized Linear Mixed Models With Application to Poisson Modeling of Spatially Varying Disease Rates', JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, vol. 18, no. 4, pp. 818-837.
View/Download from: Publisher's site
View description>>
Generalized linear mixed models (GLMMs) are often fit by computational procedures such as penalized quasi-likelihood (PQL). Special cases of GLMMs are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints make it difficult to apply these iterative procedures to datasets having a very large number of records. We propose a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits submodels of the GLMM to collapsed versions of the data. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status, and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. For Poisson and binomial regression models, the Gauss-Seidel approach is found to substantially outperform existing methods in terms of maximum analyzable sample size. Remarkably, for both models, the average time per iteration and the total time until convergence of the Gauss-Seidel procedure are less than 0.3% of the corresponding times for the IWLS algorithm. Platform-independent pseudo-code for fitting GLMS, as well as the source code used to generate and analyze the datasets in the simulation studies, are available online as supplemental materials. © 2009 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
Gunes, H & Piccardi, M 2009, 'Automatic Temporal Segment Detection and Affect Recognition From Face and Body Display', IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, vol. 39, no. 1, pp. 64-84.
View/Download from: Publisher's site
View description>>
Psychologists have long explored mechanisms with which humans recognize other humans' affective states from modalities, such as voice and face display. This exploration has led to the identification of the main mechanisms, including the important role played in the recognition process by the modalities' dynamics. Constrained by the human physiology, the temporal evolution of a modality appears to be well approximated by a sequence of temporal segments called onset, apex, and offset. Stemming from these findings, computer scientists, over the past 15 years, have proposed various methodologies to automate the recognition process. We note, however, two main limitations to date. The first is that much of the past research has focused on affect recognition from single modalities. The second is that even the few multimodal systems have not paid sufficient attention to the modalities' dynamics: The automatic determination of their temporal segments, their synchronization to the purpose of modality fusion, and their role in affect recognition are yet to be adequately explored. To address this issue, this paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment. The experimental results obtained show the following: 1) affective face and body displays are simultaneous but not strictly synchronous; 2) explicit detection of the temporal phases can improve the accuracy of affect recognition; 3) recognition from fused face and body modalities performs better than that from the face or the body modality alone; and 4) synchronized feature-level fusion achieves better performance than decision-level fusion.
Hart, JE, Yanosky, JD, Puett, RC, Ryan, L, Dockery, DW, Smith, TJ, Garshick, E & Laden, F 2009, 'Spatial Modeling of PM10 and NO2 in the Continental United States, 1985-2000', ENVIRONMENTAL HEALTH PERSPECTIVES, vol. 117, no. 11, pp. 1690-1696.
View/Download from: Publisher's site
View description>>
BACKGROUND: Epidemiologic studies of air pollution have demonstrated a link between long-term air pollution exposures and mortality. However, many have been limited to city-specific average pollution measures or spatial or land-use regression exposure models in small geographic areas. OBJECTIVES: Our objective was to develop nationwide models of annual exposure to particulate matter < 10 μm in diameter (PM10) and nitrogen dioxide during 1985-2000. METHODS: We used generalized additive models (GAMs) to predict annual levels of the pollutants using smooth spatial surfaces of available monitoring data and geographic information system-derived covariates. Model performance was determined using a cross-validation (CV) procedure with 10% of the data. We also compared the results of these models with a commonly used spatial interpolation, inverse distance weighting. RESULTS: For PM10, distance to road, elevation, proportion of low-intensity residential, high-intensity residential, and industrial, commercial, or transportation land use within 1 km were all statistically significant predictors of measured PM10 (model R2 = 0.49, CV R2 = 0.55). Distance to road, population density, elevation, land use, and distance to and emissions of the nearest nitrogen oxides-emitting power plant were all statistically significant predictors of measured NO2 (model R2 = 0.88, CV R2 = 0.90). The GAMs performed better overall than the inverse distance models, with higher CV R2 and higher precision. CONCLUSIONS: These models provide reasonably accurate and unbiased estimates of annual exposures for PM10 and NO2. This approach provides the spatial and temporal variability necessary to describe exposure in studies assessing the health effects of chronic air pollution.
Li, J & Kim, J 2009, 'Data-Aided Synchronization for MF-TDMA Multi-Carrier Demultiplexer/Demodulator (MCDD)', IEEE Transactions on Broadcasting, vol. 55, no. 3, pp. 623-632.
View/Download from: Publisher's site
Juszczyszyn, K, Musiał, K, Kazienko, P & Gabrys, B 2009, 'Temporal changes in local topology of an email-based social network', Computing and Informatics, vol. 28, no. 6, pp. 763-779.
View description>>
The dynamics of complex social networks has become one of the research areas of growing importance. The knowledge about temporal changes of the network topology and characteristics is crucial in networked communication systems in which accurate predictions are important. The local network topology can be described by the means of network motifs which are small subgraphs - usually containing from 3 to 7 nodes. They were shown to be useful for creating profiles that reveal several properties of the network. In this paper, the time-varying characteristics of social networks, such as the number of nodes and edges as well as clustering coefficients and different centrality measures are investigated. At the same time, the analysis of three-node motifs (triads) was used to track the temporal changes in the structure of a large social network derived from e-mail communication between university employees. We have shown that temporal changes in local connection patterns of the social network are indeed correlated with the changes in the clustering coefficient as well as various centrality measures values and are detectable by means of motifs analysis. Together with robust sampling network motifs can provide an appealing way to monitor and assess temporal changes in large social networks.
Kadlec, P & Gabrys, B 2009, 'Architecture for development of adaptive on-line prediction models', Memetic Computing, vol. 1, no. 4, pp. 241-269.
View/Download from: Publisher's site
View description>>
This work presents an architecture for the development of on-line prediction models. The architecture defines unified modular environment based on three concepts from machine learning, these are: (i) ensemble methods, (ii) local learning, and (iii) meta learning. The three concepts are organised in a three layer hierarchy within the architecture. For the actual prediction making any data-driven predictive method such as artificial neural network, support vector machines, etc. can be implemented and plugged in. In addition to the predictive methods, data pre-processing methods can also be implemented as plug-ins. Models developed according to the architecture can be trained and operated in different modes. With regard to the training, the architecture supports the building of initial models based on a batch of training data, but if this data is not available the models can also be trained in incremental mode. In a scenario where correct target values are (occasionally) available during the run-time, the architecture supports life-long learning by providing several adaptation mechanisms across the three hierarchical levels. In order to demonstrate its practicality, we show how the issues of current soft sensor development and maintenance can be effectively dealt with by using the architecture as a construction plan for the development of adaptive soft sensing algorithms. © Springer-Verlag 2009.
Kadlec, P, Gabrys, B & Strandt, S 2009, 'Data-driven Soft Sensors in the process industry', Computers & Chemical Engineering, vol. 33, no. 4, pp. 795-814.
View/Download from: Publisher's site
Kelly, AH & Stoianoff, NP 2009, 'Biodiversity conservation, local government finance and differential rates: The good, the bad and the potentially attractive', Environmental and Planning Law Journal, vol. 26, no. 1, pp. 5-18.
View description>>
Local councils in New South Wales and across Australia are constrained by insufficient financial resources. This inhibits functional expansion and service improvement in non-traditional but growing areas of operation. A ready example is biodiversity conservation, where councils are under pressure to lift their game. The focus here is on local government's key funding source, namely 'rating', and its implications for protecting the natural environment. As a traditional property tax, rating generally falls outside the biodiversity conservation toolbox. This raises the idea of utilising one specific aspect of rating - the categorisation and sub-categorisation of rated land - as a potential mechanism for conservation purposes. In order to achieve this, statutory and policy change is necessary, including review of the longstanding rating benefit given to farmlands. The crux of this article is the potential benefits of introducing a new rating category for conservation purposes.
Kile, ML, Hoffman, E, Hsueh, Y-M, Afroz, S, Quamruzzaman, Q, Rahman, M, Mahiuddin, G, Ryan, L & Christiani, DC 2009, 'Variability in Biomarkers of Arsenic Exposure and Metabolism in Adults over Time', ENVIRONMENTAL HEALTH PERSPECTIVES, vol. 117, no. 3, pp. 455-460.
View/Download from: Publisher's site
View description>>
Background: Urinary arsenic metabolites (UAs) are used as biomarkers of exposure and metabolism. Objectives: To characterize inter- and intraindividual variability in UAs in healthy individuals. Methods: In a longitudinal study conducted in Bangladesh, we collected water and spot urine samples from 196 participants every 3 months for 2 years. Water arsenic (As) was measured by inductively coupled plasma-mass spectrometry, and urinary As species [arsenite, arsenate, monomethylarsonic acid (MMA), and dimethylarsinic acid (DMA)] were detected using high-performance liquid chromatography-hydride-generated atomic absorption spectrometry. We used linear mixed-effects models to compute variance components and evaluate the association between UAs and selected factors. Results: The concentrations of UAs were fairly reproducible within individuals, with intraclass correlation coefficients (ICCs) of 0.41, 0.35, 0.47, and 0.49 for inorganic As (InAs), MMA, DMA, and total urinary As (TUA), respectively. However, when expressed as a ratio, the percent InAs (%InAs), %MMA, and %DMA were poorly reproducible within individuals, with ICCs of 0.16, 0.16, and 0.17, respectively. Arsenic metabolism was significantly associated with sex, exposure, age, smoking, chewing betel nut, urinary creatinine, and season. Specificity and sensitivity analyses showed that a single urine sample adequately classified a participant's urinary As profile as high or low, but TUA had only moderate specificity for correctly classifying drinking water exposures. Conclusions: Epidemiologic studies should use both urinary As concentrations and the relative proportion of UAs to minimize measurement error and to facilitate interpretation of factors that influence As metabolism.
Li, J & Liu, Q 2009, '‘Double water exclusion’: a hypothesis refining the O-ring theory for the hot spots at protein interfaces', Bioinformatics, vol. 25, no. 6, pp. 743-750.
View/Download from: Publisher's site
View description>>
Abstract Motivation: The O-ring theory reveals that the binding hot spot at a protein interface is surrounded by a ring of residues that are energetically less important than the residues in the hot spot. As this ring of residues serves to occlude water molecules from the hot spot, the O-ring theory is also called the ‘water exclusion’ hypothesis. We propose a ‘double water exclusion’ hypothesis to refine the O-ring theory by assuming the hot spot itself is water-free. To computationally model a water-free hot spot, we use a biclique pattern, defined as two maximal groups of residues from two chains in a protein complex with the property that every residue contacts all residues in the other group. Methods and Results: Given a chain pair A and B of a protein complex from the Protein Data Bank (PDB), we calculate the interatomic distance of all possible pairs of atoms between A and B. We then represent A and B as a bipartite graph based on this distance information. Maximal biclique subgraphs are subsequently identified from all of the bipartite graphs to locate biclique patterns at the interfaces. We address two properties of biclique patterns: a non-redundant occurrence in the PDB, and a correspondence with hot spots when the solvent-accessible surface area (SASA) of a biclique pattern in the complex form is small. A total of 1293 biclique patterns are discovered which have a non-redundant occurrence of at least five, and which each have a minimum of two and four residues on the two sides. Through extensive queries to the HotSprint and ASEdb databases, we verified that biclique patterns are rich in true hot residues. Our algorithm and results provide a new way to identify hot spots by examining proteins' structural data. Availability: The biclique mining algorithm is available at http://www.ntu.edu.sg/home/jyli/dwe.html. ...
Lindahl, M, José, M, Jurado, L, Pilar, M, Ramos, G & Carmen, M 2009, 'A decision making method for educational management based on distance measures', Revista de Metodos Cuantitativos para la Economia y la Empresa, vol. 8, pp. 29-49.
View description>>
We develop a new approach for decision making in educational management based on the use of distance measures. We focus on the selection of a studies plan from the perspective of an academic institution. We develop this approach by showing the benefits of establishing an ideal plan that we compare with the available alternatives. We use the Minkowski distance, the ordered weighted averaging (OWA) operator and interval numbers. The use of the Minkowski distance allows us to make comparisons between the ideal plan and the plans available in the market. The OWA operator is an aggregation operator that provides a parameterized family of aggregation operators that includes the maximum, the minimum and the average criteria, among others. Interval numbers are a very useful technique for representing information when the environment is very complex, because they give all the possible results from the minimum to the maximum. We introduce a new aggregation operator called the uncertain generalized ordered weighted averaging distance (UGOWAD) operator. It is a distance aggregation operator that uses the main characteristics of the Minkowski distance, the OWA operator and interval numbers. We develop an illustrative example where we can see the usefulness of the UGOWAD operator for selecting a studies plan in education management. The main advantage of using the UGOWAD is that we can consider a wide range of distance aggregation methods in the decision problem. The decision maker thus gets a more complete view of the decision problem, and is able to select the alternative that best fits his or her interests.
Liu, G, Sim, K, Li, J & Wong, L 2009, 'Efficient mining of distance‐based subspace clusters', Statistical Analysis and Data Mining: The ASA Data Science Journal, vol. 2, no. 5-6, pp. 427-444.
View/Download from: Publisher's site
View description>>
Abstract Traditional similarity measurements often become meaningless when the dimensions of datasets increase. Subspace clustering has been proposed to find clusters embedded in subspaces of high‐dimensional datasets. Many existing algorithms use a grid‐based approach to partition the data space into nonoverlapping rectangular cells, and then identify connected dense cells as clusters. The rigid boundaries of the grid‐based approach may cause a real cluster to be divided into several small clusters. In this paper, we propose to use a sliding‐window approach to partition the dimensions to preserve significant clusters. We call this model the nCluster model. The sliding‐window approach generates more bins than the grid‐based approach, and thus incurs a higher mining cost. We develop a deterministic algorithm, called MaxnCluster, to mine nClusters efficiently. MaxnCluster uses several techniques to speed up the mining, and it produces only maximal nClusters to reduce result size. Non‐maximal nClusters are pruned without the need to store the discovered nClusters in memory, which is key to the efficiency of MaxnCluster. Our experimental results show that (i) the nCluster model can indeed preserve clusters that are shattered by the grid‐based approach on synthetic datasets; (ii) the nCluster model produces more significant clusters than the grid‐based approach on two real gene expression datasets; and (iii) MaxnCluster is efficient in mining maximal nClusters. Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 2: 427‐444, 2009
Cao, L, Dai, R & Zhou, M 2009, 'Metasynthesis: M-Space, M-Interaction, and M-Computing for Open Complex Giant Systems', IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 39, no. 5, pp. 1007-1021.
View/Download from: Publisher's site
View description>>
The studies of complex systems have been recognized as one of the greatest challenges for current and future science and technology. Open complex giant systems (OCGSs) are a family of especially complex systems with system complexities such as openness, human involvement, societal characteristics, and intelligence emergence. They greatly challenge multiple disciplines such as system sciences, systems engineering, cognitive sciences, information systems, artificial intelligence, and computer sciences. Traditional problem-solving methodologies can help deal with them, but are far from a mature solution methodology. The theory of qualitative-to-quantitative metasynthesis has been proposed as a breakthrough and effective methodology for understanding and solving OCGS problems. In this paper, we propose the concepts of M-Interaction, M-Space, and M-Computing, three key components for studying OCGSs and building problem-solving systems. M-Interaction forms the main problem-solving mechanism of qualitative-to-quantitative metasynthesis; M-Space is the OCGS problem-solving system embedded with M-Interactions; and M-Computing consists of engineering approaches to the analysis, design, and implementation of M-Space and M-Interaction. We discuss the theoretical framework, problem-solving process, social cognitive evolution, intelligence emergence, and pitfalls of certain types of cognitions in developing M-Space and M-Interaction from the perspectives of cognitive sciences and social cognitive interaction. These can help one understand complex systems and develop effective problem-solving methodologies. © 2009 IEEE.
Lu, S, Zhang, J & Feng, DD 2009, 'Detecting Ghost and Left Objects in Surveillance Video', International Journal of Pattern Recognition and Artificial Intelligence, vol. 23, no. 7, pp. 1503-1525.
View/Download from: Publisher's site
View description>>
This paper proposes an efficient method for detecting ghost and left objects in surveillance video, which, if not identified, may lead to errors or wasted computational power in background modeling and object tracking in video surveillance systems. This method contains two main steps: the first is to detect stationary objects, which narrows down the evaluation targets to a very small number of regions in the input image; the second is to discriminate the candidates between ghost and left objects. For the first step, we introduce a novel stationary object detection method based on continuous object tracking and shape matching. For the second step, we propose a fast and robust inpainting method to differentiate between ghost and left objects by reconstructing the real background using the candidate's corresponding regions in the current input and background image. The effectiveness of our method has been validated by experiments over a variety of video sequences and comparisons with existing state-of-the-art methods.
Meeker, JD, Missmer, SA, Altshul, L, Vitonis, AF, Ryan, L, Cramer, DW & Hauser, R 2009, 'Serum and follicular fluid organochlorine concentrations among women undergoing assisted reproduction technologies', ENVIRONMENTAL HEALTH, vol. 8, p. 32.
View/Download from: Publisher's site
View description>>
BACKGROUND: Exposure to persistent organic pollutants, including polychlorinated biphenyls (PCBs) and organochlorine pesticides, is widespread among the general population. There is evidence of adverse effects on reproduction and early pregnancy in relation to organochlorine exposure but human studies remain limited. The increased use of assisted reproductive technologies (ART) presents unique opportunities for the assessment of environmental influences on early pregnancy outcomes not otherwise observable in humans, but studies need to be designed to maximize the efficiency of the exposure data collected while minimizing exposure measurement error. METHODS: The present study was conducted to assess the correlation between concentrations of organochlorines in serum and follicular fluid samples collected from a subset of women undergoing ART in a large study that took place between 1994 and 2003, as well as the temporal reliability of serum organochlorine concentrations among women undergoing multiple ART cycles in the study. PCB congeners (118, 138, 153, and 180), 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (p,p'-DDT), the DDT metabolite p,p'-DDE, hexachlorobenzene (HCB), oxychlordane, trans-nonachlor and mirex were measured in 72 follicular fluid samples and 265 serum samples collected from 110 women. RESULTS: Organochlorine concentrations in paired serum and follicular fluid samples were correlated, with Pearson and Spearman coefficients ranging from 0.60 to 0.92. Serum organochlorine concentrations were two- to three-fold greater than in follicular fluid, and a significant inverse trend was observed in the distribution of follicular fluid:serum ratios with increasing molecular weight of the compound (p-value for trend < 0.0001). Serum organochlorine concentrations were highly reliable over the course of several months, with intraclass correlation coefficients ranging from 0.86 to 0.98. Finally, there was evidence for a declining trend in organochlor...
Merigó, JM & Gil-Lafuente, AM 2009, 'The induced generalized OWA operator', Information Sciences, vol. 179, no. 6, pp. 729-741.
View/Download from: Publisher's site
Merigó, JM & Casanovas, M 2009, 'Geometric operators in decision making with minimization of regret', World Academy of Science, Engineering and Technology, vol. 39, pp. 514-521.
View description>>
We study different types of aggregation operators and the decision making process with minimization of regret. We analyze the original work developed by Savage and the recent work developed by Yager that generalizes the MMR method, creating a parameterized family of minimal regret methods by using the ordered weighted averaging (OWA) operator. We suggest a new method that uses different types of geometric operators, such as the weighted geometric mean or the ordered weighted geometric (OWG) operator, to generalize the MMR method, obtaining a new parameterized family of minimal regret methods. The main result of this method is that it allows the aggregation of negative numbers in the OWG operator. Finally, we give an illustrative example.
Merigó, JM & Casanovas, M 2009, 'Induced aggregation operators in decision making with the Dempster-Shafer belief structure', International Journal of Intelligent Systems, vol. 24, no. 8, pp. 934-954.
View/Download from: Publisher's site
Merigó, JM & Gil-Lafuente, AM 2009, 'OWA operators in generalized distances', World Academy of Science, Engineering and Technology, vol. 33, pp. 866-873.
View description>>
Different types of aggregation operators such as the ordered weighted quasi-arithmetic mean (Quasi-OWA) operator and the normalized Hamming distance are studied. We introduce the use of the OWA operator in generalized distances such as the quasi-arithmetic distance. We will call this new distance aggregation the ordered weighted quasi-arithmetic distance (Quasi-OWAD) operator. We develop a general overview of this type of generalization and study some of its main properties, such as the distinction between descending and ascending orders. We also consider different families of Quasi-OWAD operators such as the Minkowski ordered weighted averaging distance (MOWAD) operator, the ordered weighted averaging distance (OWAD) operator, the Euclidean ordered weighted averaging distance (EOWAD) operator, the normalized quasi-arithmetic distance, etc.
Ou, Y, Cao, L & Zhang, C 2009, 'Adaptive Anomaly Detection of Coupled Activity Sequences', The IEEE Intelligent Informatics Bulletin, vol. 10, no. 1, pp. 12-16.
View description>>
Many real-life applications often involve multiple sequences which are coupled with each other. It is unreasonable either to study the multiple coupled sequences separately or simply to merge them into one sequence, because the information about their interacting relationships would be lost. Furthermore, such coupled sequences also frequently undergo significant changes, which are likely to degrade the performance of a trained model. Taking the detection of abnormal trading activity patterns in stock markets as an example, this paper proposes a Hidden Markov Model-based approach to address the above two issues. Our approach is suitable for sequence analysis on multiple coupled sequences and can adapt to significant sequence changes automatically. Substantial experiments conducted on a real dataset show that our approach is effective.
Owen, MH, Ryan, LM & Holmes, LB 2009, 'Effects of Retinoic Acid on Dominant hemimelia Expression in Mice', BIRTH DEFECTS RESEARCH PART A-CLINICAL AND MOLECULAR TERATOLOGY, vol. 85, no. 1, pp. 36-41.
View/Download from: Publisher's site
View description>>
BACKGROUND: Dominant hemimelia (Dh) is an autosomal dominant mutation that arose spontaneously in mice. Dh animals are asplenic and they exhibit asymmetric hindlimb defects in association with reduced numbers of lumbar vertebrae. These defects suggest that Dh acts early in embryonic development to affect patterning of the anterior-posterior (A-P) and left-right axes. This study was undertaken to determine whether retinoic acid (RA), which is involved in A-P patterning and coordination of bilaterally synchronized somitogenesis, affects phenotypic expression of the Dh gene. METHODS: Thirty-four pregnant females were given, by oral intubation, a single dose of 50 or 75 mg all-trans RA per kilogram body weight at GD 9, 10, or 11. The pregnant females were then euthanized at GD 18 and fetuses removed by cesarean section. A total of 326 fetuses were identified by phenotype and linked DNA and their skeletons were analyzed. RESULTS: There was a differential effect of RA on the axial skeleton and hindlimb of Dh/+ mice as compared to their wild-type littermates. Dose- and stage-specific effects on sternebrae and vertebrae were observed. CONCLUSIONS: The effects of RA dosing on numbers of sternebrae and vertebrae suggest that Dh embryos have a primary defect in retinoid-mediated A-P patterning. Dosing with RA may produce the observed effects on phenotypic expression of Dh/+ by indirectly or directly modifying an already existing altered Hox expression pattern. As the relationship between axial patterning and the asymmetric limb is unknown, Dh is an important model for studying this relationship. © 2008 Wiley-Liss, Inc.
Pham, TD, Brandl, M & Beck, D 2009, 'Fuzzy declustering-based vector quantization', Pattern Recognition, vol. 42, no. 11, pp. 2570-2577.
View/Download from: Publisher's site
View description>>
Vector quantization is a useful approach for multi-dimensional data compression and pattern classification. One of the most popular techniques for vector quantization design is the LBG (Linde, Buzo, Gray) algorithm. To address the problem of producing poor estimates of vector centroids caused by biased data in vector quantization, we propose a fuzzy declustering strategy for the LBG algorithm. The proposed technique calculates appropriate declustering weights to adjust the global data distribution. Using the result of fuzzy declustering-based vector quantization design, we incorporate the notion of fuzzy partition entropy into the distortion measures, which can be useful for classification of spectral features. Experimental results obtained from simulated and real data sets demonstrate the effective performance of the proposed approach. © 2009 Elsevier Ltd. All rights reserved.
Qin, Y, Zhang, S, Zhu, X, Zhang, J & Zhang, C 2009, 'Estimating confidence intervals for structural differences between contrast groups with missing data', Expert Systems with Applications, vol. 36, no. 3, pp. 6431-6438.
View/Download from: Publisher's site
View description>>
Difference detection is extremely useful, for example, in evaluating a new medicine B against a specified disease by comparing it to an old medicine A that has been used to treat the disease for many years. The datasets generated by applying A and B to the disease are called contrast groups, and the main differences between the groups are the mean and distribution differences, referred to as structural differences in this paper. However, contrast groups are only two samples obtained by limited applications or tests of A and B, and may contain missing values. Therefore, the differences derived from the groups are inevitably uncertain. In this paper, we propose a statistically sound approach for measuring this uncertainty by identifying the confidence intervals of structural differences between contrast groups. This method is designed specifically for applications whose exact data distributions are unknown a priori and whose data may contain missing values. We apply our approach to UCI datasets to illustrate its power as a new data mining technique for tasks such as distinguishing spam from non-spam emails, and benign breast cancer from malignant breast cancer. © 2008 Elsevier Ltd. All rights reserved.
Qin, Y, Zhang, S, Zhu, X, Zhang, J & Zhang, C 2009, 'POP algorithm: Kernel-based imputation to treat missing values in knowledge discovery from databases', Expert Systems with Applications, vol. 36, no. 2, pp. 2794-2804.
View/Download from: Publisher's site
View description>>
To impute missing values, one solution is to use correlations between the attributes of the data. The problem is that it is difficult to identify relations within data containing missing values. Accordingly, we develop a kernel-based missing data imputation approach in this paper. This approach aims at making an optimal inference on statistical parameters, namely the mean, distribution function and quantile, after missing data are imputed. We refer to this approach as the parameter optimization method (POP algorithm). We experimentally evaluate our approach and demonstrate that our POP algorithm (random regression imputation) is much better than deterministic regression imputation in efficiency and in generating inferences on the above parameters. © 2008 Elsevier Ltd. All rights reserved.
Ruta, D & Gabrys, B 2009, 'A framework for machine learning based on dynamic physical fields', Natural Computing, vol. 8, no. 2, pp. 219-237.
View/Download from: Publisher's site
View description>>
Despite recent successes and advancements in artificial intelligence and machine learning, this domain remains under continuous challenge and guidance from phenomena and processes observed in the natural world. Humans remain unsurpassed in their efficiency in dealing with and learning from uncertain information coming in a variety of forms, while more and more robust learning and optimisation algorithms have their analytical engines built on the basis of nature-inspired phenomena. The excellence of neural networks and kernel-based learning methods, and the emergence of particle-, swarm-, and social-behaviour-based optimisation methods, are just a few of many facts indicating a trend towards greater exploitation of nature-inspired models and systems. This work intends to demonstrate how the simple concept of a physical field can be adopted to build a complete framework for supervised and unsupervised learning methodology. Inspiration for artificial learning has been found in the mechanics of physical fields on both micro and macro scales. Exploiting the analogies between data and charged particles subjected to gravity, electrostatic and gas-particle fields, a family of new algorithms has been developed and applied to classification, clustering and data condensation, while properties of the field were further used in a unique visualisation of classification and classifier fusion models. The paper covers extensive pictorial examples and visual interpretations of the presented techniques along with some comparative testing over well-known real and artificial datasets. © Springer Science+Business Media B.V. 2007.
Ryan, L 2009, 'Spatial Epidemiology: Some Pitfalls and Opportunities', EPIDEMIOLOGY, vol. 20, no. 2, pp. 242-244.
View/Download from: Publisher's site
Sanchez, BN, Budtz-Jorgensen, E & Ryan, LM 2009, 'An Estimating Equations Approach to Fitting Latent Exposure Models with Longitudinal Health Outcomes', ANNALS OF APPLIED STATISTICS, vol. 3, no. 2, pp. 830-856.
View/Download from: Publisher's site
View description>>
The analysis of data arising from environmental health studies which collect a large number of measures of exposure can benefit from using latent variable models to summarize exposure information. However, difficulties with estimation of model parameters may arise since existing fitting procedures for linear latent variable models require correctly specified residual variance structures for unbiased estimation of regression parameters quantifying the association between (latent) exposure and health outcomes. We propose an estimating equations approach for latent exposure models with longitudinal health outcomes which is robust to misspecification of the outcome variance. We show that compared to maximum likelihood, the loss of efficiency of the proposed method is relatively small when the model is correctly specified. The proposed equations formalize the ad-hoc regression on factor scores procedure, and generalize regression calibration. We propose two weighting schemes for the equations, and compare their efficiency. We apply this method to a study of the effects of in-utero lead exposure on child development. © Institute of Mathematical Statistics, 2009.
Sanchez, BN, Houseman, EA & Ryan, LM 2009, 'Residual-Based Diagnostics for Structural Equation Models', BIOMETRICS, vol. 65, no. 1, pp. 104-115.
View/Download from: Publisher's site
Shen, C, Paisitkriangkrai, S & Zhang, J 2009, 'Efficiently Learning a Detection Cascade with Sparse Eigenvectors', IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 22-35.
View/Download from: Publisher's site
View description>>
In this work, we first show that feature selection methods other than boosting can also be used for training an efficient object detector. In particular, we introduce Greedy Sparse Linear Discriminant Analysis (GSLDA) [Moghaddam2007Fast] for its conceptual simplicity and computational efficiency, and slightly better detection performance is achieved compared with [Viola2004Robust]. Moreover, we propose a new technique, termed Boosted Greedy Sparse Linear Discriminant Analysis (BGSLDA), to efficiently train a detection cascade. BGSLDA exploits the sample re-weighting property of boosting and the class-separability criterion of GSLDA.
Sim, K, Li, J, Gopalkrishnan, V & Liu, G 2009, 'Mining maximal quasi‐bicliques: Novel algorithm and applications in the stock market and protein networks', Statistical Analysis and Data Mining: The ASA Data Science Journal, vol. 2, no. 4, pp. 255-273.
View/Download from: Publisher's site
View description>>
Abstract Several real‐world applications require the mining of bicliques, as they represent correlated pairs of data clusters. However, mining quality is adversely affected by missing and noisy data. Moreover, some applications only require strong interactions between data members of the pairs, whereas bicliques are pairs that display complete interactions. We address these two limitations by proposing maximal quasi‐bicliques. Maximal quasi‐bicliques tolerate erroneous and missing data, and also relax the interactions between the data members of their pairs. Besides, maximal quasi‐bicliques do not suffer from the skewed distribution of missing edges that prior quasi‐bicliques have. We develop an algorithm, MQBminer, which mines the complete set of maximal quasi‐bicliques from either bipartite or non‐bipartite graphs. We demonstrate the versatility and effectiveness of maximal quasi‐bicliques in discovering highly correlated pairs of data in two diverse real‐world datasets. First, we propose to solve a novel financial stock analysis problem using maximal quasi‐bicliques to co‐cluster stocks and financial ratios. Results show that the stocks in our co‐clusters usually have significant correlations in their price performance. Second, we use maximal quasi‐bicliques on a protein network mining problem, and we show that pairs of protein groups mined by maximal quasi‐bicliques are more significant than those mined by maximal bicliques. Copyright © 2009 Wiley Periodicals, Inc., A Wiley Company
Thongkam, J, Xu, G, Zhang, Y & Huang, F 2009, 'Toward breast cancer survivability prediction models through improving training space', Expert Systems with Applications, vol. 36, no. 10, pp. 12200-12209.
View/Download from: Publisher's site
View description>>
Due to the difficulties posed by outliers and skewed data, the prediction of breast cancer survivability has presented many challenges in the field of data mining and pattern recognition, especially in medical research. To solve these problems, we have proposed a hybrid approach to generating higher quality data sets in the creation of improved breast cancer survival prediction models. This approach comprises two main steps: (1) utilization of an outlier filtering approach based on C-Support Vector Classification (C-SVC) to identify and eliminate outlier instances; and (2) application of an over-sampling approach using over-sampling with replacement to increase the number of instances in the minority class. In order to assess the capability and effectiveness of the proposed approach, several measurement methods including basic performance (e.g., accuracy, sensitivity, and specificity), Area Under the receiver operating characteristic Curve (AUC) and F-measure were utilized. Moreover, a 10-fold cross-validation method was used to reduce the bias and variance of the results of breast cancer survivability prediction models. Results have indicated that the proposed approach leads to improving the performance of breast cancer survivability prediction models by up to 28.34% due to the improved training data space.
Tsai, P, Cao, L, Hintz, T & Jan, T 2009, 'A bi-modal face recognition framework integrating facial expression with facial appearance', Pattern Recognition Letters, vol. 30, no. 12, pp. 1096-1109.
View/Download from: Publisher's site
View description>>
Among many biometric characteristics, the facial biometric is considered to be the least intrusive technology that can be deployed in a real-world visual surveillance environment. However, in facial biometrics, little research attention has been paid to facial expression changes. In fact, facial expression changes have often been treated as noise that would degrade recognition performance. This paper studies an innovative viewpoint: (1) can facial expression changes, namely facial behavior, be positively used for face recognition? (2) furthermore, can facial behavior be integrated with facial appearance to assist extra-personal separation and enhance face recognition performance? We propose a bi-modal face recognition framework which integrates facial expression with facial appearance. Substantial experiments on multiple facial appearance and facial expression datasets have been conducted. Our experimental results have validated that facial behavior can play a positive role in face recognition and can assist facial appearance in extra-personal separation in multiple modalities for personal identification improvement.
Wang, L, Wu, Q, Li, M, Gonzàlez, J & Geng, X 2009, 'Editorial', International Journal of Pattern Recognition and Artificial Intelligence, vol. 23, no. 7, pp. 1221-1222.
View/Download from: Publisher's site
Wang, L, Wu, Q, Wang, H, Geng, X & Li, M 2009, 'Image/video-based pattern analysis and HCI applications', Pattern Recognition Letters, vol. 30, no. 12, pp. 1047-1047.
View/Download from: Publisher's site
Weuve, J, Korrick, SA, Weisskopf, MA, Ryan, LM, Schwartz, J, Nie, H, Grodstein, F & Hu, H 2009, 'Cumulative Exposure to Lead in Relation to Cognitive Function in Older Women', ENVIRONMENTAL HEALTH PERSPECTIVES, vol. 117, no. 4, pp. 574-580.
View/Download from: Publisher's site
View description>>
Background: Recent data indicate that chronic low-level exposure to lead is associated with accelerated declines in cognition in older age, but this has not been examined in women. Objective: We examined biomarkers of lead exposure in relation to performance on a battery of cognitive tests among older women. Methods: Patella and tibia bone lead - measures of cumulative exposure over many years - and blood lead, a measure of recent exposure, were assessed in 587 women 47-74 years of age. We assessed their cognitive function 5 years later using validated telephone interviews. Results: Mean ± SD lead levels in tibia, patella, and blood were 10.5 ± 9.7 μg/g bone, 12.6 ± 11.6 μg/g bone, and 2.9 ± 1.9 μg/dL, respectively, consistent with community-level exposures. In multivariable-adjusted analyses of all cognitive tests combined, levels of all three lead biomarkers were associated with worse cognitive performance. The association between bone lead and letter fluency score differed dramatically from the other bone lead-cognitive score associations, and exclusion of this particular score from the combined analyses strengthened the associations between bone lead and cognitive performance. Results were statistically significant only for tibia lead: one SD increase in tibia lead corresponded to a 0.051-unit lower standardized summary cognitive score (95% confidence interval: -0.099 to -0.003; p = 0.04), similar to the difference in cognitive scores we observed between women who were 3 years apart in age. Conclusions: These findings suggest that cumulative exposure to lead, even at low levels experienced in community settings, may have adverse consequences for women's cognition in older age.
Yan, X, Zhang, C & Zhang, S 2009, 'Genetic algorithm-based strategy for identifying association rules without specifying actual minimum support', Expert Systems with Applications, vol. 36, no. 2, pp. 3066-3076.
View/Download from: Publisher's site
View description>>
We design a genetic algorithm-based strategy for identifying association rules without specifying an actual minimum support. In this approach, an elaborate encoding method is developed, and the relative confidence is used as the fitness function. With the genetic algorithm, a global search can be performed and system automation is achieved, because our model does not require a user-specified minimum support threshold. Furthermore, we extend this strategy to cover quantitative association rule discovery. For efficiency, we design a generalized FP-tree to implement this algorithm. We experimentally evaluate our approach and demonstrate that our algorithms significantly reduce computation costs and generate only interesting association rules. © 2008 Elsevier Ltd. All rights reserved.
Yao, S, Li, J & Shi, Z 2009, 'Phosphate Ion Removal from Aqueous Solution Using an Iron Oxide-Coated Fly Ash Adsorbent', Adsorption Science & Technology, vol. 27, no. 6, pp. 603-614.
View/Download from: Publisher's site
Zeng, X, Pei, J, Wang, K & Li, J 2009, 'PADS: a simple yet effective pattern-aware dynamic search method for fast maximal frequent pattern mining', Knowledge and Information Systems, vol. 20, no. 3, pp. 375-391.
View/Download from: Publisher's site
View description>>
While frequent pattern mining is fundamental for many data mining tasks, mining maximal frequent patterns efficiently is important in both theory and applications of frequent pattern mining. The fundamental challenge is how to search a large space of ite
Zhang, H, Zhao, Y, Cao, L, Zhang, C & Bohlscheid, H 2009, 'Customer Activity Sequence Classification for Debt Prevention in Social Security', Journal of Computer Science and Technology, vol. 24, no. 6, pp. 1000-1009.
View/Download from: Publisher's site
View description>>
From a data mining perspective, sequence classification is to build a classifier using frequent sequential patterns. However, mining a complete set of sequential patterns on a large dataset can be extremely time-consuming, and the large number of patterns discovered also makes pattern selection and classifier building very time-consuming. The fact is that, in sequence classification, it is much more important to discover discriminative patterns than a complete pattern set. In this paper, we propose a novel hierarchical algorithm to build sequential classifiers using discriminative sequential patterns. Firstly, we mine for the sequential patterns that are most strongly correlated with each target class. In this step, an aggressive strategy is employed to select a small set of sequential patterns. Secondly, pattern pruning and a serial coverage test are performed on the mined patterns. The patterns that pass the serial test are used to build the sub-classifier at the first level of the final classifier. Thirdly, the training samples that cannot be covered are fed back to the sequential pattern mining stage with updated parameters. This process continues until predefined interestingness measure thresholds are reached, or all samples are covered. The patterns generated in each loop form the sub-classifier at each level of the final classifier. Within this framework, the search space can be reduced dramatically while good classification performance is achieved. The proposed algorithm is tested in a real-world business application for debt prevention in the social security area. The novel sequence classification algorithm shows its effectiveness and efficiency in predicting debt occurrences based on customer activity sequence data.
Zhang, J, Yang, C, Huang, C, Li, Y, Dai, N, Lu, P, Jiang, Z, Chen, W & Li, J 2009, '10 W CW ytterbium-doped fiber laser with 4 × 1 fused fiber bundle combiner', Frontiers of Optoelectronics in China, vol. 2, no. 1, pp. 61-63.
View/Download from: Publisher's site
Zhang, Y & Xu, G 2009, 'On web communities mining and recommendation', Concurrency and Computation: Practice and Experience, vol. 21, no. 5, pp. 561-582.
View/Download from: Publisher's site
View description>>
Because of the lack of a uniform schema for web documents and the sheer amount and dynamics of web data, both the effectiveness and the efficiency of information management and retrieval of web data are often unsatisfactory when using conventional data management and searching techniques. To address this issue, we have adopted web mining and web community analysis approaches. On the basis of the analysis of web document contents, hyperlink analysis, user access logs and semantic analysis, we have developed various approaches or algorithms to construct and analyze web communities, and to make recommendations. This paper will introduce and discuss several approaches on web community mining and recommendation. Copyright © 2009 John Wiley & Sons, Ltd.
Zhang, Z, Yang, P, Wu, X & Zhang, C 2009, 'An Agent-Based Hybrid System for Microarray Data Analysis', IEEE Intelligent Systems, vol. 24, no. 5, pp. 53-63.
View/Download from: Publisher's site
View description>>
This article reports our experience in agent-based hybrid construction for microarray data analysis. The contributions are twofold: We demonstrate that agent-based approaches are suitable for building hybrid systems in general, and that a genetic ensemble system is appropriate for microarray data analysis in particular. Created using an agent-based framework, this genetic ensemble system for microarray data analysis excels in both sample classification accuracy and gene selection reproducibility.
Beck, D & Wong, STC 2009, 'Conference Scene: Wake-up call for the engineering and biomedical science communities in nanomedicine', Nanomedicine, Future Medicine Ltd, pp. 515-517.
View/Download from: Publisher's site
View description>>
The IEEE-NIH 4th Life Science Systems and Applications Workshop 2009 (LiSSA ’09) was jointly organized by the IEEE LiSSA Technical Committee and the NIH Nano Task Force. It was endorsed by the NIH Biomedical Information Science and Technology Initiative (BISTI) and the National Library of Medicine. The workshop was held in the Natcher Conference Center on the NIH campus in Bethesda, MD, USA. It took place on 9–10 April 2009, during the NIH NanoWeek, and had approximately 200 delegates from around the globe (including North America, Europe, Asia and Australia) from both the engineering and biomedical science disciplines. The conference featured around 40 talks, including nine plenary speakers, emphasizing current state-of-the-art nanotechnology practices, developments and industry applications. All talks were scheduled in three oral and seven special sessions, as well as three breakout sessions. In addition, the interactive poster sessions hosted over 30 abstracts and attracted much attention from the audience; these sessions were readily used by many attendees to connect with colleagues of similar interests. In this article, we provide some of the highlights from the workshop.
Beck, D, Zhou, X, Pham, T, Sabatini, B & Wong, STC 2009, 'An image driven systems biology approach for neurodegenerative disease studies in the TSC-mTOR pathway', 2009 IEEE/NIH Life Science Systems and Applications Workshop, 2009 IEEE/NIH Life Science Systems and Applications Workshop (LiSSA), IEEE, pp. 36-39.
View/Download from: Publisher's site
View description>>
In this brief paper we present an overview of the TSC-mTOR pathway and its importance in neurodegenerative disease (ND). We illustrate the influence of ND on dendritic spine morphology. Then we discuss some details of functional gene networks (FGN) and use this information to propose an image driven systems biology approach for the construction of a FGN for ND. We conclude on its importance and the prospective outcome of our study. © 2009 IEEE.
Benter, A, Xu, R, Moore, W, Antolovich, M & Gao, J 2009, 'Fragment size detection within homogeneous material using ground penetrating radar', 2009 International Radar Conference 'Surveillance for a Safer World', RADAR 2009.
View description>>
Ground Penetrating Radar (GPR) offers the ability to observe the internal structure of a pile of rocks. Large fragments within the pile may not be visible on the surface. Determining these large fragment sizes before collection can improve mine productivity. This research has examined the potential to identify objects where the background media and the object exhibit the same dielectric properties. Preliminary results are presented which show identification is possible using standard GPR equipment.
Brodka, P, Musial, K & Kazienko, P 2009, 'A Performance of Centrality Calculation in Social Networks', 2009 International Conference on Computational Aspects of Social Networks, 2009 International Conference on Computational Aspects of Social Networks (CASON), IEEE, ESIGETEL, Fontainebleau, FRANCE, pp. 24-31.
View/Download from: Publisher's site
Brodka, P, Musial, K & Kazienko, P 2009, 'Efficiency of Node Position Calculation in Social Networks', KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II, PROCEEDINGS, 13th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, Univ Chile, Fac Phys Sci & Math, Santiago, CHILE, pp. 455-463.
View/Download from: Publisher's site
Cao, L 2009, 'Data Mining in Financial Markets', Advanced Data Mining and Applications, 5th International Conference, ADMA 2009, International Conference on Advanced Data Mining and Applications, Springer Berlin Heidelberg, Budapest, Hungary, pp. 4-4.
View/Download from: Publisher's site
Cao, L, Luo, D & Zhang, C 2009, 'Ubiquitous Intelligence in Agent Mining', ADMI 2009, International Workshop on Agents and Data Mining Interaction, Springer Berlin Heidelberg, Budapest, Hungary, pp. 23-35.
View/Download from: Publisher's site
View description>>
Agent mining, namely the interaction and integration of multi-agent technology and data mining, has emerged as a very promising research area. While many mutual issues exist in both the multi-agent and data mining areas, most of them can be described in terms of, or related to, ubiquitous intelligence. It is certainly very important to define, specify, represent, analyze and utilize ubiquitous intelligence in agents, data mining, and agent mining. This paper presents a novel but preliminary investigation of ubiquitous intelligence in these areas. We specify five types of ubiquitous intelligence: data intelligence, human intelligence, domain intelligence, network and web intelligence, and organizational and social intelligence. We define and illustrate them, and discuss techniques for involving them in agents, data mining, and agent mining for complex problem-solving. Further investigation into involving and synthesizing ubiquitous intelligence in agents, data mining, and agent mining will lead to a disciplinary upgrade from methodological, technical and practical perspectives.
Chen, L & Bhowmick, SS 2009, 'In the Search of NECTARs from Evolutionary Trees', DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 14th International Conference on Database Systems for Advanced Applications, SPRINGER-VERLAG BERLIN, Brisbane, AUSTRALIA, pp. 714-+.
Chen, P & Li, J 2009, 'Prediction of protein long-range contacts using GaMC approach with sequence profile centers', 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop, 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop, BIBMW, IEEE, pp. 128-135.
View/Download from: Publisher's site
View description>>
In this paper, we apply an evolutionary optimization classifier, referred to as the genetic algorithm-based multiple classifier (GaMC), to long-range contact prediction. As a result, about 44.1% of contacts between long-range residues (with a sequence separation of at least 24 amino acids) are found around the sequence profile (SP) center when evaluating the top L/5 (L is the sequence length of the protein) classified contacts, if the SP centers are known. Meanwhile, with knowledge of the sequence profile center and the GaMC method, about 20.42% of long-range contacts are correctly predicted. Results show that the SP center may provide a sound pathway for predicting contact maps in protein structures. ©2009 IEEE.
Eastwood, M & Gabrys, B 2009, 'A Non-sequential Representation of Sequential Data for Churn Prediction', KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT I, PROCEEDINGS, 13th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, Univ Chile, Fac Phys Sci & Math, Santiago, CHILE, pp. 209-218.
View/Download from: Publisher's site
Gabrys, B 2009, 'Learning with Missing or Incomplete Data', IMAGE ANALYSIS AND PROCESSING - ICIAP 2009, PROCEEDINGS, 15th International Conference on Image Analysis and Processing (ICIAP 2009), Springer Berlin Heidelberg, Vietri sul Mare, ITALY, pp. 1-4.
View/Download from: Publisher's site
Homayounfard, H & Kennedy, PJ 2009, 'HDAX: Historical symbolic modelling of delay time series in a communications network', Conferences in Research and Practice in Information Technology Series, Australian Data Mining Conference, Australian Computer Society, Melbourne, Australia, pp. 129-137.
View description>>
There are certain performance parameters, such as packet delay, delay variation (jitter) and loss, which are decision factors for online quality of service (QoS) traffic routing. Although considerable effort has been devoted to assuring QoS on the Internet, the dominant TCP/IP best-effort communications policy does not provide sufficient guarantees without abrupt changes to the protocols. Estimating and forecasting end-to-end delay and its variations are essential tasks in network routing management for detecting anomalies. A large amount of research has been done to provide foreknowledge of network anomalies by characterizing and forecasting delay with numerical forecasting methods. However, these methods are time-consuming and not efficient for real-time application when dealing with large online datasets. Application is more difficult when data are missing or unavailable during online forecasting. Moreover, the time cost of statistical methods for trivial gains in forecasting accuracy is prohibitive. Consequently, many researchers suggest a transition from computing with numbers to the manipulation of perceptions in the form of fuzzy linguistic variables. The current work addresses the issue of defining a delay approximation model for packet switching in communications networks. In particular, we focus on decision-making for smart routing management, which is based on the knowledge provided by data mining (informed) agents. We propose a historical symbolic delay approximation model (HDAX) for delay forecasting. Preliminary experiments with the model show good accuracy in forecasting the delay time series as well as a reduction in the time cost of the forecasting method. HDAX compares favourably with the competing Autoregressive Moving Average (ARMA) algorithm in terms of execution time and accuracy. © 2009, Australian Computer Society, Inc.
Juszczyszyn, K & Musiał, K 2009, 'Structural Changes in an Email-Based Social Network', AGENT AND MULTI-AGENT SYSTEMS: TECHNOLOGIES AND APPLICATIONS, PROCEEDINGS, 3rd KES International Symposium on Agent and Multi-Agent Systems, Springer Berlin Heidelberg, Uppsala Univ, Uppsala, SWEDEN, pp. 40-49.
View/Download from: Publisher's site
Juszczyszyn, K, Musial, A, Musial, K & Brodka, P 2009, 'Molecular dynamics modelling of the temporal changes in complex networks', 2009 IEEE Congress on Evolutionary Computation, 2009 IEEE Congress on Evolutionary Computation (CEC), IEEE, Trondheim, NORWAY, pp. 553-+.
View/Download from: Publisher's site
Kadlec, P & Gabrys, B 2009, 'Evolving on-line prediction model dealing with industrial data sets', 2009 IEEE Workshop on Evolving and Self-Developing Intelligent Systems, 2009 IEEE Workshop on Evolving and Self-Developing Intelligent Systems (ESDIS), IEEE, Nashville, TN, pp. 24-31.
View/Download from: Publisher's site
Kadlec, P & Gabrys, B 2009, 'Soft Sensor Based on Adaptive Local Learning', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer Berlin Heidelberg, pp. 1172-1179.
View/Download from: Publisher's site
View description>>
When it comes to the application of computational learning techniques in practical scenarios, such as adaptive inferential control, it is often difficult to apply state-of-the-art techniques in a straightforward manner, and usually some effort has to be dedicated to tuning either the data, in the form of data pre-processing, or the modelling techniques, in the form of optimal parameter search or modification of the training algorithm. In this work we present a robust approach to on-line predictive modelling which focuses on dealing with challenges such as noisy data, data outliers and, in particular, drifting data, which are often present in industrial data sets. The approach is based on local learning, where models of limited complexity focus on partitions of the input space, and on an ensemble building technique which combines the predictions of the particular local models into the final predicted value. Furthermore, the technique provides the means for on-line adaptation and can thus be deployed in a dynamic environment, which is demonstrated in this work through an application of the presented approach to a raw industrial data set exhibiting drifting data, outliers, missing values and measurement noise. © 2009 Springer Berlin Heidelberg.
Kadlec, P & Gabrys, B 2009, 'Soft sensors: where are we and what are the current and future challenges?', IFAC Proceedings Volumes, Elsevier BV, pp. 572-577.
View/Download from: Publisher's site
View description>>
In this work we present a summary of the review on data-driven soft sensors published in [1], together with a proposal for how to deal with the identified issues and challenges. We discuss the most common approaches to the development of soft sensors, followed by a critical analysis of the main issues in current soft sensor development. Currently, these are the time that has to be spent on model development, including data pre-processing and model building, together with the effort that needs to be spent on periodic performance assessment and re-training of the model. Based on the identified problems, we propose a solution based on a model development architecture which can accommodate different data pre-processing techniques, predictive modelling methods as well as approaches for model adaptation. The architecture is based on a structure which unifies several concepts from machine learning, such as ensemble methods, local learning, meta learning and concept drift handling. Using the above mechanisms, it provides the means for automated data pre-processing, model validation, selection and adaptation, which can be used to significantly simplify the soft sensor building and maintenance process. Copyright © 2007 International Federation of Automatic Control.
Kazienko, P, Musiał, K & Zgrzywa, A 2009, 'Evaluation of node position based on email communication', Control and Cybernetics, Conference on Data Processing Technologies, POLISH ACAD SCIENCES SYSTEMS RESEARCH INST, Poznan, POLAND, pp. 67-86.
View description>>
Rapid development of various kinds of social networks within the Internet enabled investigation of their properties and analyzing their structure. An interesting scientific problem in this domain is the assessment of the node position within the directed, weighted graph that represents the social network of email users. The new method of node position analysis, which takes into account both the node positions of the neighbors and the strength of connections between network nodes, is presented in the paper. The node position can be used to discover key network users, who are the most important in the population and who have potentially the greatest influence on others. The experiments carried out on two datasets enabled studying the main properties of the new measure.
Kennedy, PJ, Ong, K & Christen, P 2009, 'Data Mining and Analytics', Data Mining and Analytics 2009 (AusDM'09), Australian Data Mining Conference, Australian Computer Society, Melbourne, Australia, pp. 1-218.
Kennedy, PJ, Ong, KL & Christen, P 2009, 'Preface', Conferences in Research and Practice in Information Technology Series.
Kusakunniran, W, Li, H & Zhang, J 2009, 'A Direct Method to Self-Calibrate a Surveillance Camera by Observing a Walking Pedestrian', 2009 Digital Image Computing: Techniques and Applications, 2009 Digital Image Computing: Techniques and Applications, IEEE, Melbourne, VIC, pp. 250-255.
View/Download from: Publisher's site
View description>>
Recent efforts show that it is possible to calibrate a surveillance camera simply from observing a walking human. This procedure can be seen as a special application of the camera self-calibration technique. Several methods have been proposed along this
Kusakunniran, W, Wu, Q, Li, H & Zhang, J 2009, 'Automatic Gait Recognition Using Weighted Binary Pattern on Video', 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), IEEE, Genoa, Italy, pp. 49-54.
View/Download from: Publisher's site
View description>>
Human identification by recognizing spontaneous gait recorded in real-world settings is a tough and not yet fully resolved problem in biometrics research. Several issues contribute to the difficulty of this task. They include various poses, different clothes, moderate to large changes in normal walking manner due to carrying diverse goods while walking, and the uncertainty of the environments in which people are walking. In order to achieve better gait recognition, this paper proposes a new method based on the Weighted Binary Pattern (WBP). WBP first constructs a binary pattern from a sequence of aligned silhouettes. Then, an adaptive weighting technique is applied to discriminate the significance of the bits in gait signatures. Compared with most existing methods in the literature, this method can better deal with gait frequency, local spatial-temporal human pose features, and global body shape statistics. The proposed method is validated on several well-known benchmark databases. The extensive and encouraging experimental results show that the proposed algorithm achieves high accuracy with low complexity and computational time. © 2009 IEEE.
Kusakunniran, W, Wu, Q, Li, H & Zhang, J 2009, 'Multiple views gait recognition using View Transformation Model based on optimized Gait Energy Image', 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, IEEE, Kyoto, Japan, pp. 1058-1064.
View/Download from: Publisher's site
View description>>
Gait is one of the well-recognized biometrics that has been widely used for human identification. However, current gait recognition may have difficulties due to changes in viewing angle, because the viewing angle under which the gait signature database was generated may not be the same as the viewing angle when the probe data are obtained. This paper proposes a new multi-view gait recognition approach which tackles the problems mentioned above. Differing from other approaches of the same category, this new method creates a so-called View Transformation Model (VTM) based on the spatial-domain Gait Energy Image (GEI) by adopting the Singular Value Decomposition (SVD) technique. To further improve the performance of the proposed VTM, Linear Discriminant Analysis (LDA) is used to optimize the obtained GEI feature vectors. When implementing SVD there are a few practical problems, such as large matrix size and over-fitting. In this paper, reduced SVD is introduced to alleviate the effects caused by these problems. Using the generated VTM, the viewing angles of gallery gait data and probe gait data can be transformed into the same direction. Thus, gait signatures can be measured without difficulty. Extensive experiments show that the proposed algorithm can significantly improve multiple-view gait recognition performance when compared to similar methods in the literature. ©2009 IEEE.
Lemke, C, Riedel, S & Gabrys, B 2009, 'Dynamic combination of forecasts generated by diversification procedures applied to forecasting of airline cancellations', 2009 IEEE Symposium on Computational Intelligence for Financial Engineering, 2009 IEEE Symposium on Computational Intelligence for Financial Engineering (CIFEr), IEEE, pp. 85-91.
View/Download from: Publisher's site
View description>>
The combination of forecasts is a well-established procedure for improving forecast performance and decreasing the risk of selecting an inferior model out of an existing pool of models. Work in this area mainly focuses on combining several functionally different models, but some publications also deal with combining forecasts that share the same functional approach. In the latter case, individual forecasts are generated by diversifying one or more model parameters or, if dealing with hierarchical data, by using forecasts from different levels. This work looks at multi-dimensional data from the airline industry, with the aim of improving the forecast of cancellation rates for bookings. Three different methods are employed for the generation of individual forecasts. Forecast combinations are usually implemented in a more or less static structure, either including all available forecasts or trimming a fixed percentage of the worst-performing models. For a large number of individual forecasts, this procedure can become inefficient. In this paper, a dynamic approach to pooling and trimming is applied to the generated forecasts for airline cancellation data. © 2009 IEEE.
Li, L, Xu, G, Zhang, Y & Kitsuregawa, M 2009, 'Enhancing Web Search by Aggregating Results of Related Web Queries', Web Information Systems Engineering - WISE 2009 Lecture Notes in Computer Science, Web Information Systems Engineering - WISE 2009, Springer Berlin Heidelberg, Poznan, Poland, pp. 203-217.
View/Download from: Publisher's site
View description>>
Currently, commercial search engines have implemented methods to suggest alternative Web queries to users, which helps them specify alternative related queries in pursuit of finding needed Web pages. In this paper, we address the Web search problem on related queries to improve retrieval quality by devising a novel search rank aggregation mechanism. Given an initial query and the suggested related queries, our search system concurrently processes their search result lists from an existing search engine and then forms a single list aggregated from all the retrieved lists. In particular, we propose a generic rank aggregation framework which considers not only the number of wins that an item won in a competition, but also the quality of its competitor items in calculating the ranking of Web items. The framework combines the traditional and random walk based rank aggregation methods to produce a more reasonable list for users. Experimental results show that the proposed approach can clearly improve the retrieval quality in a parallel manner over the traditional search strategy that serially returns result lists. Moreover, we also empirically investigate how different rank aggregation methods affect retrieval performance.
Merigó, JM 2009, 'On the Use of the OWA Operator in the Weighted Average and its Application in Decision Making', WORLD CONGRESS ON ENGINEERING 2009, VOLS I AND II, World Congress on Engineering 2009, INT ASSOC ENGINEERS-IAENG, Imperial Coll London, London, ENGLAND, pp. 82-87.
Merigó, JM 2009, 'Induced Generalized Aggregation Operators in the Weighted Average', Intelligent Decision Making Systems, Proceedings of the 4th International ISKE Conference on Intelligent Systems and Knowledge Engineering, WORLD SCIENTIFIC, Hasselt Univ, Hasselt, BELGIUM, pp. 625-+.
View/Download from: Publisher's site
Merigó, JM 2009, 'On the Unification Between the Probability, the Weighted Average and the OWA Operator', Intelligent Decision Making Systems, Proceedings of the 4th International ISKE Conference on Intelligent Systems and Knowledge Engineering, WORLD SCIENTIFIC, Hasselt Univ, Hasselt, BELGIUM, pp. 375-+.
View/Download from: Publisher's site
Merigó, JM 2009, 'Probabilistic decision making with the OWA operator and its application in investment management', 2009 International Fuzzy Systems Association World Congress and 2009 European Society for Fuzzy Logic and Technology Conference, IFSA-EUSFLAT 2009 - Proceedings, Joint World Congress of International-Fuzzy-Systems-Association (IFSA)/European Conference of European-Society-for-Fuzzy-Logic-and-Technology (EUSFLAT), EUROPEAN SOC FUZZY LOGIC & TECHNOLOGY, Lisbon, PORTUGAL, pp. 1364-1369.
View description>>
We develop a new model for decision making under risk and uncertainty. We introduce a new aggregation operator that unifies probabilities and the ordered weighted averaging (OWA) operator in the same formulation. We call it the probabilistic ordered weighted averaging (POWA) operator. This aggregation operator provides a more complete representation of the decision problem because it is able to consider probabilistic information and the attitudinal character of the decision maker. We study different properties and families of the POWA operator. We also develop an illustrative example of the new approach in a decision making problem concerning the selection of investments.
Merigó, JM 2009, 'The Fuzzy Probabilistic Weighted Averaging Operator and its Application in Decision Making', 2009 Ninth International Conference on Intelligent Systems Design and Applications, 2009 Ninth International Conference on Intelligent Systems Design and Applications, IEEE, Univ Pisa, Pisa, ITALY, pp. 485-490.
View/Download from: Publisher's site
Merigó, JM & Casanovas, M 2009, 'Using Distance Measures in Heavy Aggregation Operators', Intelligent Decision Making Systems, Proceedings of the 4th International ISKE Conference on Intelligent Systems and Knowledge Engineering, WORLD SCIENTIFIC, Hasselt Univ, Hasselt, BELGIUM, pp. 589-594.
View/Download from: Publisher's site
Merigo, JM & Gil-Lafuente, AM 2009, 'An Overview of Fuzzy Research in the ISI Web of Knowledge', WORLD CONGRESS ON ENGINEERING 2009, VOLS I AND II, World Congress on Engineering 2009, INT ASSOC ENGINEERS-IAENG, Imperial Coll London, London, ENGLAND, pp. 43-48.
Merigó, JM & Gil-Lafuente, AM 2009, 'Some Basic Results of Fuzzy Research in the ISI Web of Knowledge', 2009 Ninth International Conference on Intelligent Systems Design and Applications, 2009 Ninth International Conference on Intelligent Systems Design and Applications, IEEE, Univ Pisa, Pisa, ITALY, pp. 1215-1220.
View/Download from: Publisher's site
Merigó, JM & Gil-Lafuente, AM 2009, 'The fuzzy induced generalized OWA operator and its application in business decision making', 2009 International Fuzzy Systems Association World Congress and 2009 European Society for Fuzzy Logic and Technology Conference, IFSA-EUSFLAT 2009 - Proceedings, Joint World Congress of International-Fuzzy-Systems-Association (IFSA)/European Conference of European-Society-for-Fuzzy-Logic-and-Technology (EUSFLAT), EUROPEAN SOC FUZZY LOGIC & TECHNOLOGY, Lisbon, PORTUGAL, pp. 1661-1666.
View description>>
We present the fuzzy induced generalized OWA (FIGOWA) operator. It is an aggregation operator that uses the main characteristics of the fuzzy OWA (FOWA) operator, the induced OWA (IOWA) operator and the generalized OWA (GOWA) operator. Therefore, it uses uncertain information represented in the form of fuzzy numbers, generalized means and order-inducing variables. The main advantage of this operator is that it includes a wide range of mean operators in the same formulation, such as the FOWA, the IOWA, the GOWA, the induced GOWA, the fuzzy IOWA and the fuzzy generalized mean. We study some of its main properties. A further generalization using quasi-arithmetic means is also presented, called the Quasi-FIOWA operator. We also develop an application of the new approach to a strategic decision making problem.
Merigo, JM, Gil-Lafuente, AM & Gil-Aluja, J 2009, 'Induced Aggregation Operators in the Generalized Adequacy Coefficient', WORLD CONGRESS ON ENGINEERING 2009, VOLS I AND II, World Congress on Engineering 2009, INT ASSOC ENGINEERS-IAENG, Imperial Coll London, London, ENGLAND, pp. 7-11.
Merigó, JM, Gil-Lafuente, AM & Martorell, O 2009, 'On the Use of the Uncertain Induced OWA Operator and the Uncertain Weighted Average and its Application in Tourism Management', 2009 Ninth International Conference on Intelligent Systems Design and Applications, 2009 Ninth International Conference on Intelligent Systems Design and Applications, IEEE, Univ Pisa, Pisa, ITALY, pp. 856-+.
View/Download from: Publisher's site
Moghaddam, Z & Piccardi, M 2009, 'Deterministic Initialization of Hidden Markov Models for Human Action Recognition', 2009 Digital Image Computing: Techniques and Applications, 2009 Digital Image Computing: Techniques and Applications, IEEE, Melbourne, Australia, pp. 188-195.
View/Download from: Publisher's site
View description>>
Human action recognition is often approached in terms of probabilistic models such as the hidden Markov model or other graphical models. When learning such models by way of Expectation-Maximisation algorithms, arbitrary choices must be made for their initial parameters. Often, solutions for the selection of the initial parameters are based on random functions. However, in this paper, we argue that deterministic alternatives are preferable, and propose various methods. Experiments on a video dataset show that the deterministic initialization achieves an accuracy comparable to or above the average from random initializations, and suffers from no deviation thanks to its deterministic nature. The methods proposed naturally extend to other graphical models such as dynamic Bayesian networks and conditional random fields.
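The abstract does not detail the paper's specific initialization methods. As one illustrative deterministic alternative (our assumption, not necessarily the authors' scheme), initial state means for scalar observations can be taken from equal-sized quantile segments of the sorted data, so EM always starts from the same point:

```python
def deterministic_state_means(observations, n_states):
    """Sort the scalar observations and use the mean of each of n_states
    equal-sized quantile segments as an initial state mean. Unlike random
    initialization, repeated runs give identical starting points."""
    data = sorted(observations)
    size = len(data) // n_states
    means = []
    for k in range(n_states):
        # the last segment absorbs any remainder
        seg = data[k * size:(k + 1) * size] if k < n_states - 1 else data[k * size:]
        means.append(sum(seg) / len(seg))
    return means
```

Because the result depends only on the multiset of observations, any accuracy it yields is reproducible, which is the property the paper argues for.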
Musial, K & Juszczyszyn, K 2009, 'Motif-Based Analysis of Social Position Influence on Interconnection Patterns in Complex Social Network', 2009 First Asian Conference on Intelligent Information and Database Systems, 2009 First Asian Conference on Intelligent Information and Database Systems, ACIIDS, IEEE, Dong Hoi, VIETNAM, pp. 34-39.
View/Download from: Publisher's site
Musiał, K & Juszczyszyn, K 2009, 'Properties of Bridge Nodes in Social Networks', COMPUTATIONAL COLLECTIVE INTELLIGENCE: SEMANTIC WEB, SOCIAL NETWORKS AND MULTIAGENT SYSTEMS, 1st International Conference on Computational Collective Intelligence, Springer Berlin Heidelberg, Wroclaw, POLAND, pp. 357-364.
View/Download from: Publisher's site
Musial, K, Juszczyszyn, K, Gabrys, B & Kazienko, P 2009, 'Patterns of Interactions in Complex Social Networks Based on Coloured Motifs Analysis', ADVANCES IN NEURO-INFORMATION PROCESSING, PT II, 15th International Conference on Neuro-Information Processing, Springer Berlin Heidelberg, Auckland, NEW ZEALAND, pp. 607-614.
View/Download from: Publisher's site
Musiał, K, Kazienko, P & Bródka, P 2009, 'User position measures in social networks', Proceedings of the 3rd Workshop on Social Network Mining and Analysis, KDD09: The 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM.
View/Download from: Publisher's site
View description>>
Network analysis offers many centrality measures that are successfully utilized when investigating a social network's profile. The most important and representative measures are presented in the paper, including indegree centrality, proximity prestige, rank prestige, node position, outdegree centrality, eccentricity, closeness centrality, and betweenness centrality. Both feature analysis and experimental comparative studies reveal the general profile of the selected measures.
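Two of the listed measures can be sketched directly. Assuming an undirected, connected graph stored as a dict of neighbour sets (our own representation, chosen for illustration), degree and closeness centrality are:

```python
from collections import deque

def degree_centrality(adj):
    """Degree centrality of each node v: deg(v) / (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj, v):
    """Closeness centrality of v: (n - 1) divided by the sum of BFS
    shortest-path distances from v (connected graph assumed)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(adj) - 1) / sum(dist.values())
```

On a three-node path a-b-c, the middle node b scores 1.0 on both measures, matching the intuition that it occupies the strongest position.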
Otoom, AF, Concha, OP, Gunes, H & Piccardi, M 2009, 'Mixtures of Normalized Linear Projections', ADVANCED CONCEPTS FOR INTELLIGENT VISION SYSTEMS, PROCEEDINGS, Advanced Concepts for Intelligent Vision Systems, Springer, Bordeaux, France, pp. 66-76.
View/Download from: Publisher's site
View description>>
High dimensional spaces pose a challenge to any classification task. In fact, these spaces contain much redundancy and it becomes crucial to reduce the dimensionality of the data to improve analysis, density modeling, and classification. In this paper, we present a method for dimensionality reduction in mixture models and its use in classification. For each component of the mixture, the data are projected by a linear transformation onto a lower-dimensional space. Subsequently, the projection matrices and the densities in such compressed spaces are learned by means of an Expectation Maximization (EM) algorithm. However, two main issues arise as a result of implementing this approach, namely: 1) the scale of the densities can be different across the mixture components and 2) a singularity problem may occur. We suggest solutions to these problems and validate the proposed method on three image data sets from the UCI Machine Learning Repository. The classification performance is compared with that of a mixture of probabilistic principal component analysers (MPPCA). Across the three data sets, our accuracy always compares favourably, with improvements ranging from 2.5% to 35.4%. © 2009 Springer Berlin Heidelberg.
Paisitkriangkrai, S, Shen, C & Zhang, J 2009, 'Efficiently training a better visual detector with sparse eigenvectors', 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Miami, FL, pp. 1129-1136.
View/Download from: Publisher's site
View description>>
Face detection plays an important role in many vision applications. Since Viola and Jones [1] proposed the first real-time AdaBoost based object detection system, much effort has been spent on improving the boosting method. In this work, we first show
Paisitkriangkrai, S, Shen, C & Zhang, J 2009, 'Efficiently training a better visual detector with sparse eigenvectors', 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops), IEEE.
View/Download from: Publisher's site
Xu, RYD & Kemp, M 2009, 'Multiple curvature based approach to human upper body parts detection with connected ellipse model fine-tuning', 2009 16th IEEE International Conference on Image Processing (ICIP), 2009 16th IEEE International Conference on Image Processing ICIP 2009, IEEE, pp. 2577-2580.
View/Download from: Publisher's site
View description>>
In this paper, we discuss an effective method for detecting human upper body parts from a 2D image silhouette using curvature analysis and ellipse fitting. First we smooth the silhouette so that we can determine just the global features: the head, hands and armpits. Next we reduce the smoothing to detect the local features of the neck and elbows. We model the human upper body by multiple connected ellipses. Thus we segment the body by the extracted features. Ellipses are fitted to each segment. Lastly, we apply a non-linear least square method to minimize the differences between the connected ellipse model and the edge of the silhouette. ©2009 IEEE.
Du, R, Wu, Q, He, X, Jia, W & Wei, D 2009, 'Facial expression recognition using histogram variances faces', 2009 Workshop on Applications of Computer Vision (WACV), 2009 Workshop on Applications of Computer Vision (WACV), IEEE, Snowbird, USA, pp. 341-347.
View/Download from: Publisher's site
View description>>
In facial expression recognition, the representation of expression features is essential for recognition accuracy. In this work we propose a novel approach for extracting expression dynamic features from facial expression videos. Rather than utilising statistical models such as the Hidden Markov Model (HMM), our approach integrates expression dynamic features into a static image, the Histogram Variances Face (HVF), by fusing histogram variances among the frames in a video. HVFs can be automatically obtained from videos with different frame rates and are immune to illumination interference. In our experiments, for videos picturing the same facial expression (e.g., surprise, happiness or sadness), the corresponding HVFs are similar, even though the performers and frame rates differ. Therefore, static facial recognition approaches can be utilised for dynamic expression recognition. We have applied this approach to the well-known Cohn-Kanade AU-Coded Facial Expression database, classified HVFs using PCA and Support Vector Machines (SVMs), and found the accuracy of HVF classification very encouraging. © 2009 IEEE.
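As a rough illustration of collapsing a video's dynamics into one static image, the sketch below computes a per-pixel variance image. This is a simplified stand-in of our own: the paper fuses per-frame histogram variances, which this toy does not reproduce exactly.

```python
def variance_image(frames):
    """Collapse a video (a list of equal-sized 2-D greyscale frames) into
    one static image whose pixels hold the per-pixel variance across
    frames, so temporal change is encoded in a single image."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [f[i][j] for f in frames]
            mean = sum(vals) / n
            out[i][j] = sum((v - mean) ** 2 for v in vals) / n
    return out
```

Pixels that never change across the video map to zero, while moving regions light up, which is why a static classifier can then be applied to the fused image.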
Smith, D, Hanlen, L, Zhang, JA, Miniutti, D, Rodda, D & Gilbert, B 2009, 'Characterization of the Dynamic Narrowband On-Body to Off-Body Area Channel', 2009 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-8, IEEE International Conference on Communications (ICC 2009), IEEE, Dresden, GERMANY, pp. 4207-+.
Tang, M, Zhou, Y, Cui, P, Wang, W, Li, J, Zhang, H, Hou, Y & Yan, B 2009, 'Discovery of Migration Habitats and Routes of Wild Bird Species by Clustering and Association Analysis', ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 5th International Conference on Advanced Data Mining and Applications, Springer Berlin Heidelberg, Beijing, PEOPLES R CHINA, pp. 288-301.
View/Download from: Publisher's site
Thi, TH, Lu, S, Zhang, J, Cheng, L & Wang, L 2009, 'Human Body Articulation for Action Recognition in Video Sequences', 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), IEEE, Genova, pp. 92-97.
View/Download from: Publisher's site
View description>>
This paper presents a new technique for action recognition in video using a human body part-based approach, combining both local feature description of each body part and a global graphical model structure of the human action. The human body is divided into elementary points from which a Decomposable Triangulated Graph is built. The temporal variation of human activity is encoded in the velocity distribution of each node in the graph, while the graph structure shows the spatial configuration of all the nodes in the action. Tracked trajectories of unlabeled good feature points are correctly labeled using Maximum a Posteriori probability. Dynamic Programming is then implemented to accelerate the exhaustive search for the optimal labeling of unknown body parts and the best possible action. A simple and efficient technique for building the optimal structure of the human action graph is also implemented. Experimental results on the KTH dataset prove the success and potential applications of this technique. © 2009 IEEE.
Tsai, P, Tran, TP & Cao, L 2009, 'Expression-invariant facial identification', 2009 IEEE International Conference on Systems, Man and Cybernetics, 2009 IEEE International Conference on Systems, Man and Cybernetics - SMC, IEEE, San Antonio, Texas, USA, pp. 5151-5155.
View/Download from: Publisher's site
View description>>
Facial identification has been recognized as a simple and non-intrusive technology that can be applied in many places. However, many facial identification problems remain unsolved due to intra-personal variations. In particular, when images in the databases show different facial expressions, most currently available facial recognition approaches encounter the expression-invariance problem, in which neutral faces are difficult to recognize. In this paper, a new approach is proposed to transform facial expressions into neutral-face-like images, enabling image retrieval systems to robustly identify a person's face even when the learning and testing face images differ in facial expression.
Vezzani, R, Piccardi, M & Cucchiara, R 2009, 'An efficient Bayesian framework for on-line action recognition', 2009 16th IEEE International Conference on Image Processing (ICIP), 2009 16th IEEE International Conference on Image Processing ICIP 2009, IEEE, Cairo, Egypt, pp. 3553-3556.
View/Download from: Publisher's site
View description>>
On-line action recognition from a continuous stream of actions is still an open problem, with fewer solutions proposed compared to time-segmented action recognition. The most challenging task is to classify the current action while finding its time boundaries at the same time. In this paper we propose an approach capable of performing on-line action segmentation and recognition by means of batteries of HMMs taking into account all the possible time boundaries and action classes. A suitable Bayesian normalization is applied to make observation sequences of different lengths comparable, and computational optimizations are introduced to achieve real-time performance. Results on a well-known action dataset prove the efficacy of the proposed method.
Wang, W, Shen, C, Zhang, J & Paisitkriangkrai, S 2009, 'A Two-Layer Night-Time Vehicle Detector', 2009 Digital Image Computing: Techniques and Applications, 2009 Digital Image Computing: Techniques and Applications, IEEE, Melbourne, VIC, pp. 162-167.
View/Download from: Publisher's site
View description>>
We present a two-layer night-time vehicle detector in this work. At the first layer, vehicle headlight detection [1, 2, 3] is applied to find areas (bounding boxes) where possible pairs of headlights are located in the image; the Haar feature based AdaBoost
He, X, Li, J, Wei, D, Jia, W & Wu, Q 2009, 'Canny edge detection on a virtual hexagonal image structure', 2009 Joint Conferences on Pervasive Computing (JCPC), 2009 Joint Conferences on Pervasive Computing (JCPC), IEEE, Taipei, Taiwan, pp. 167-172.
View/Download from: Publisher's site
View description>>
The Canny edge detector is the most popular tool for edge detection and has many applications in the areas of image processing, multimedia and computer vision. The Canny algorithm optimizes edge detection through noise filtering, using an optimal function approximated by the first derivative of a Gaussian. It identifies edge points by computing the gradients of the light intensity function, based on the fact that edge points likely appear where the gradient magnitudes are large. The hexagonal structure is an alternative to the traditional square image structure. Because all existing hardware for capturing and displaying images is based on the square structure, an approach that uses linear interpolation is described for conversion between the square and hexagonal structures. Gaussian filtering together with gradient computation is performed on the hexagonal structure. The pixel edge strengths on the square structure are then estimated before the thresholds of the Canny algorithm are applied to determine the final edge map. The experimental results show edge detection on the hexagonal structure using static and video images, and a comparison with the results of the Canny algorithm on the square structure. © 2009 IEEE.
Xiao, Y, Liu, B, Cao, L, Wu, X, Zhang, C, Hao, Z, Yang, F & Cao, J 2009, 'Multi-sphere Support Vector Data Description for Outliers Detection on Multi-distribution Data', 2009 IEEE International Conference on Data Mining Workshops, 2009 IEEE International Conference on Data Mining Workshops (ICDMW), IEEE, Miami, Florida, pp. 82-87.
View/Download from: Publisher's site
View description>>
SVDD has proved a powerful tool for outlier detection. However, in detecting outliers on multi-distribution data, namely data containing several distinct distributions, it is very challenging for SVDD to generate a single hyper-sphere that distinguishes outliers from normal data. Even if such a hyper-sphere can be identified, its performance is usually not good enough. This paper proposes a multi-sphere SVDD approach, named MS-SVDD, for outlier detection on multi-distribution data. First, an adaptive sphere detection method is proposed to detect data distributions in the dataset. The data is partitioned in terms of the identified distributions, and the corresponding SVDD classifiers are constructed separately. Substantial experiments on both artificial and real-world datasets have demonstrated that the proposed approach outperforms the original SVDD. © 2009 IEEE.
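The multi-sphere decision rule can be sketched with a crude centroid-plus-radius stand-in for each hypersphere. This is our own simplification for illustration: a real SVDD obtains each sphere's centre and radius by solving a quadratic program, possibly in kernel space, and the quantile radius below is invented.

```python
def fit_sphere(points, quantile=0.95):
    """Crude stand-in for one SVDD hypersphere: centroid of the points,
    radius at a distance quantile so a few boundary points may fall outside."""
    d = len(points[0])
    centre = [sum(p[k] for p in points) / len(points) for k in range(d)]
    dists = sorted(sum((p[k] - centre[k]) ** 2 for k in range(d)) ** 0.5
                   for p in points)
    radius = dists[min(int(quantile * len(dists)), len(dists) - 1)]
    return centre, radius

def is_outlier(x, spheres):
    """Multi-sphere rule: a point is normal if it lies inside at least one
    sphere, and an outlier only if it is outside all of them."""
    return all(sum((x[k] - c[k]) ** 2 for k in range(len(x))) ** 0.5 > r
               for c, r in spheres)
```

Fitting one sphere per identified distribution and flagging only points outside every sphere is what lets a multi-sphere detector handle data that no single hypersphere can describe well.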
Zhang, J, Paisitkriangkrai, S & Shen, C 2009, 'An overview of fast pedestrian detection: Feature selection and cascade framework of boosted features', 2009 IEEE International Conference on Multimedia and Expo, 2009 IEEE International Conference on Multimedia and Expo (ICME), IEEE, New York, NY, pp. 1566-1567.
View/Download from: Publisher's site
Zhang, Z, Gunes, H & Piccardi, M 2009, 'Head Detection for Video Surveillance Based on Categorical Hair and Skin Colour Models', 2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, IEEE International Conference on Image Processing, IEEE, Cairo, Egypt, pp. 1137-1140.
View/Download from: Publisher's site
View description>>
We propose a new robust head detection algorithm that is capable of handling significantly different conditions in terms of viewpoint, tilt angle, scale and resolution. To this aim, we built a new model for the head based on appearance distributions and shape constraints. We construct a categorical model for hair and skin, separately, and train the models for four categories of hair (brown, red, blond and black) and three categories of skin representing the different illumination conditions (bright, standard and dark). The shape constraint fits an elliptical model to the candidate region and compares its parameters with priors based on human anatomy. The experimental results validate the usability of the proposed algorithm in various video surveillance and multimedia applications. ©2009 IEEE.
Zhao, G, Xiong, Y, Cao, L, Luo, D, Su, X & Zhu, Y 2009, 'A Cost-Effective LSH Filter for Fast Pairwise Mining', 2009 Ninth IEEE International Conference on Data Mining, 2009 Ninth IEEE International Conference on Data Mining (ICDM), IEEE, Miami, Florida, USA, pp. 1088-1093.
View/Download from: Publisher's site
View description>>
The pairwise mining problem is to discover pairs of objects whose measures exceed a user-specified minimum threshold in a collection of objects. It is essential in a large variety of database and data-mining applications. Of late, there has been increasing interest in applying Locality-Sensitive Hashing (LSH) schemes to pairwise mining. LSH-type methods are simple to implement and achieve significant running-time gains over most exact methods. However, present LSH-type methods still suffer from bottlenecks such as 'the curse of threshold'. In this paper, we propose a novel LSH-based method, the Cost-effective LSH filter (Ce-LSH for short), for pairwise mining. Compared with previous LSH-type methods, it uses a lower, fixed number of LSH functions and is thus more cost-effective. Substantial experiments show that our method gives significant improvement in running time over existing LSH-type methods and a recently reported upper-bound-based method. Experimental results also indicate that it scales well even for a relatively low minimum threshold and a fairly small miss ratio. © 2009 IEEE.
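The filtering idea behind LSH-type pairwise mining can be sketched with MinHash signatures and banding. The hash family (salted built-in hashing), band count and row count below are illustrative assumptions, not Ce-LSH's actual construction:

```python
def minhash_signature(items, n_hashes):
    """MinHash signature of a set, salting Python's built-in hash in
    place of a proper LSH hash family."""
    return [min(hash((seed, x)) for x in items) for seed in range(n_hashes)]

def candidate_pairs(named_sets, bands=5, rows=4):
    """Band the signatures: two sets agreeing on every row of some band
    become a candidate pair, so most dissimilar pairs are filtered out
    without any exact all-pairs comparison."""
    sigs = {name: minhash_signature(s, bands * rows)
            for name, s in named_sets.items()}
    buckets = {}
    for name, sig in sigs.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, []).append(name)
    pairs = set()
    for names in buckets.values():
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                pairs.add(tuple(sorted((names[i], names[j]))))
    return pairs
```

Only the candidate pairs that share a bucket need an exact measure computation afterwards, which is the source of the running-time gains the abstract refers to.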
Zhao, L & Li, J 2009, 'Sequence-based B-cell epitope prediction by using associations in antibody-antigen structural complexes', 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop, 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop, BIBMW, IEEE, pp. 165-172.
View/Download from: Publisher's site
View description>>
B-cell secreted antibodies play a critical role in fighting against invaders and abnormal self tissues. Identifying the epitope on antigens recognized by the paratope on antibodies can enlighten the understanding of this important immune mechanism. Predicting B-cell epitopes can also pave the way for vaccine design and disease therapy. However, due to the high complexity of this problem, previous prediction methods focusing on linear and conformational epitopes are both unsatisfactory. In this work, we propose a novel method to predict B-cell epitopes for a given pair of sequences, by using associations and cooperativity patterns from a relatively small antigen-antibody structural data set. More exactly, our classifier is trained only on PDB protein complexes, but it can be applied to any sequence data. Our evaluation results show that the accuracy of our method is competitive with, and sometimes much better than, previous structure-based prediction methods, which have a smaller applicability scope than ours. © 2009 IEEE.
Zhao, Y, Zhang, H, Cao, L, Zhang, C & Bohlscheid, H 2009, 'Mining Both Positive and Negative Impact-Oriented Sequential Rules from Transactional Data', Advances in Knowledge Discovery and Data Mining, 13th Pacific-Asia Conference, PAKDD 2009, Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, Bangkok, Thailand, pp. 656-663.
View/Download from: Publisher's site
View description>>
Traditional sequential pattern mining deals only with positive correlations between sequential patterns, without considering negative relationships between them. In this paper, we present a notion of impact-oriented negative sequential rules, in which the left side is a positive sequential pattern or its negation, and the right side is a predefined outcome or its negation. Impact-oriented negative sequential rules are formally defined to show the impact of sequential patterns on the outcome, and an efficient algorithm is designed to discover both positive and negative impact-oriented sequential rules. Experimental results on both synthetic and real-life data show the efficiency and effectiveness of the proposed technique.
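The four rule forms (the pattern or its negation on the left, the outcome or its negation on the right) can be sketched by counting subsequence containment against outcome labels. This is a naive counting illustration, not the paper's efficient algorithm, and the labels P/T are our own shorthand:

```python
def contains(sequence, pattern):
    """True if pattern occurs as a (not necessarily contiguous)
    subsequence of sequence."""
    it = iter(sequence)
    return all(item in it for item in pattern)  # each 'in' consumes the iterator

def impact_rule_supports(sequences, pattern, outcome):
    """Supports of the four impact-oriented rule forms:
    P -> T, P -> not T, not P -> T, not P -> not T."""
    counts = {('P', 'T'): 0, ('P', '!T'): 0, ('!P', 'T'): 0, ('!P', '!T'): 0}
    for seq, out in sequences:
        p = 'P' if contains(seq, pattern) else '!P'
        t = 'T' if out == outcome else '!T'
        counts[(p, t)] += 1
    n = len(sequences)
    return {k: v / n for k, v in counts.items()}
```

Comparing the four supports shows whether the presence or the absence of a pattern raises the likelihood of the outcome, which is the impact the rules are meant to express.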
Zhao, Y, Zhang, H, Wu, S, Pei, J, Cao, L, Zhang, C & Bohlscheid, H 2009, 'Debt Detection in Social Security by Sequence Classification Using Both Positive and Negative Patterns', Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, European Conference on Machine Learning, Springer Berlin Heidelberg, Bled, Slovenia, pp. 648-663.
View/Download from: Publisher's site
View description>>
Debt detection is important for improving payment accuracy in social security. Since debt detection from customer transactional data can generally be modelled as a fraud detection problem, a straightforward solution is to extract features from transaction sequences and build a sequence classifier for debts. Existing sequence classification methods based on sequential patterns consider only positive patterns. However, according to our experience in a large social security application, negative patterns are very useful for accurate debt detection. In this paper, we present a successful case study of debt detection in a large social security application. The central technique is building sequence classifiers using both positive and negative sequential patterns.
Zheng, Z, Zhao, Y, Zuo, Z & Cao, L 2009, 'Negative-GSP: An efficient method for mining negative sequential patterns', Conferences in Research and Practice in Information Technology Series, Australian Data Mining Conference, Australian Computer Society, Melbourne, Australia, pp. 63-67.
View description>>
Different from traditional positive sequential pattern mining, negative sequential pattern mining considers both positive and negative relationships between items. Negative sequential pattern mining does not necessarily follow the Apriori principle, and its search space is much larger than that of positive pattern mining. After giving definitions and some constraints of negative sequential patterns, this paper proposes a new method for mining negative sequential patterns, called Negative-GSP. Negative-GSP can find negative sequential patterns effectively and efficiently by joining and pruning, and extensive experimental results show the efficiency of the method. © 2009, Australian Computer Society, Inc.
Zhou, Q, Xu, G & Zong, Y 2009, 'Web Co-clustering of Usage Network Using Tensor Decomposition', 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, IEEE, pp. 311-314.
View/Download from: Publisher's site
View description>>
Web clustering is an approach for aggregating web objects into various categories according to underlying relationships among them. Finding co-clusters of web objects is an emerging topic in the context of web usage mining. In this paper we present an algorithm that uses tensor decomposition to co-cluster web objects based on analysis of users' navigational tasks. Usage data of users visiting web sites is collected as experimental data to construct the usage network and validate the presented method. Experimental results demonstrate that the proposed method can clearly reveal the aggregations of web objects resulting from different navigational tasks. © 2009 IEEE.
Cao, L 2009, 'Data mining and multi-agent integration', pp. 1-328.
View/Download from: Publisher's site
View description>>
Data Mining and Multi-agent Integration presents cutting-edge research, applications and solutions in data mining and the practical use of innovative information technologies, written by leading international researchers in the field. Topics examined include: integration of multiagent applications and data mining; mining temporal patterns to improve agents' behavior; information enrichment through recommendation sharing; automatic web data extraction based on genetic algorithms and regular expressions; a multiagent learning paradigm for a medical data mining diagnostic workbench; a multiagent data mining framework; streaming data in complex uncertain environments; large data clustering; a multiagent, multi-objective clustering algorithm; an interactive web environment for psychometric diagnostics; anomaly detection on distributed firewalls using data mining techniques; automated reasoning for distributed and multiple sources of data; and video content identification. Data Mining and Multi-agent Integration is intended for students, researchers, engineers and practitioners in the field interested in the synergy between agents and data mining. This book is also relevant for readers in related areas such as machine learning, artificial intelligence, intelligent systems, knowledge engineering, human-computer interaction, intelligent information processing, decision support systems, knowledge management, organizational computing, social computing, complex systems, and soft computing. © Springer Science+Business Media, LLC 2009. All rights reserved.
Cao, L, Yu, PS, Zhang, C & Zhang, H 2009, 'Data Mining for Business Applications', Springer US, pp. 1-302.
View/Download from: Publisher's site
View description>>
Data Mining for Business Applications presents state-of-the-art data mining research and development related to methodologies, techniques, approaches and successful applications. The contributions of this book mark a paradigm shift from "data-centered pattern mining" to "domain-driven actionable knowledge discovery (AKD)" for next-generation KDD research and applications. The contents identify how KDD techniques can better contribute to critical domain problems in practice, and strengthen business intelligence in complex enterprise applications. The volume also explores challenges and directions for future data mining research and development in the dialogue between academia and business. Part I centers on developing workable AKD methodologies, including: domain-driven data mining; post-processing rules for actions; domain-driven customer analytics; the role of human intelligence in AKD; maximal pattern-based cluster; and ontology mining. Part II focuses on novel KDD domains and the corresponding techniques, exploring the mining of emergent areas and domains such as: social security data; community security data; gene sequences; mental health information; traditional Chinese medicine data; cancer-related data; blog data; sentiment information; web data; procedures; moving object trajectories; land use mapping; higher education data; flight scheduling; and algorithmic asset management. Researchers, practitioners and university students in the areas of data mining and knowledge discovery, knowledge engineering, human-computer interaction, artificial intelligence, intelligent information processing, decision support systems, knowledge management, and KDD project management are sure to find this a practical and effective means of enhancing their understanding and use of data mining in their own projects. © 2009 Springer Science+Business Media, LLC. All rights reserved.