Yang Yang Personal Website
.01

ABOUT (CV here)

PERSONAL DETAILS
600 Foster Street, Evanston, IL
yang.yang@kellogg.northwestern.edu
Hello. I am a Researcher, Network Scientist, and Data Scientist.
I am passionate about network science
Welcome to my Personal and Academic profile
Available as a postdoctoral researcher

BIO

ABOUT ME

Yang Yang, Ph.D., is passionate about network science and data mining research. He received his B.S. from Nanjing University in 2007 and his Ph.D. in computer science from the University of Notre Dame in 2015, advised by Professor Nitesh V. Chawla. He was a member of the iCeNSA research group and the Data, Inference, Analysis and Learning (DIAL) research group. He is currently a postdoctoral fellow at the Kellogg School of Management and NICO (Northwestern Institute on Complex Systems), working with Professor Brian Uzzi.

His principal research interest is in large-scale information and social networks. More generally, he studies data mining, statistics, and network science, with a focus on the link prediction problem, social influence analysis, and social network evolution. His research has focused on new methodologies for predicting links in social networks and on exploring the underlying principles of social network evolution. Recently he has been exploring the connection between social networking effects and the early career success of business elites, and the mechanisms underlying the dynamics of global terrorism. He is also involved in interdisciplinary projects across multiple other fields, e.g., analyzing the development of complex socio-economic systems, such as the evolution of trader alliance networks in unstable economies.

HOBBIES

INTERESTS

Culinary Art.

Music.

Movies.

FACTS

NUMBERS ABOUT ME

920
CUPS OF COFFEE
65
PROJECTS COMPLETED
2965
HOURS OF CODING
35
WORKSHOPS
2M
LINES OF CODE
100
SATISFIED CUSTOMERS

.02

RESUME (CV here)

  • EDUCATION
  • 2010
    2015
    NOTRE DAME, USA

    COMPUTER SCIENCE - Ph.D

    UNIVERSITY OF NOTRE DAME

  • 2003
    2007
    Nanjing, CHINA

    COMPUTER SCIENCE - BACHELOR

    NANJING UNIVERSITY

  • ACADEMIC AND PROFESSIONAL POSITIONS
  • 2016
    2015
    Evanston, USA

    POSTDOCTORAL FELLOW

    NORTHWESTERN UNIVERSITY

  • 2015
    2015
    Notre Dame, USA

    RESEARCH FELLOW

    UNIVERSITY OF NOTRE DAME

  • 2011
    2015
    Notre Dame, USA

    RESEARCH ASSISTANT

    UNIVERSITY OF NOTRE DAME

  • 2010
    2011
    Notre Dame, USA

    TEACHING ASSISTANT

    UNIVERSITY OF NOTRE DAME

  • PROFESSIONAL SERVICES AND OUTREACH
  • Now
    2016

    Editorial Review Board

    International Journal of Business and Management

  • Now
    2016

    Editorial Review Board

    Artificial Intelligence Research

  • 2016
    2016

    Program Committee Member

    AAAI Conference

  • 2015
    2015

    Program Committee Member

    BigDataSE Conference

  • 2013
    2013

    Program Committee Member

    HINA-IJCAI Workshop

  • 2016
    2016

    Reviewer

    IJSNM (International Journal of Social Network Mining)

  • 2016
    2016

    Reviewer

    TNNLS (Transactions on Neural Networks and Learning Systems)

  • 2016
    2016

    Reviewer

    IDA (Intelligent Data Analysis)

  • 2016
    2015

    Reviewer

    TKDE (IEEE Transactions on Knowledge and Data Engineering)

  • 2016
    2015

    Reviewer

    TKDD (ACM Transactions on Knowledge Discovery from Data)

  • 2016
    2015

    Reviewer

    SNAM (Social Network Analysis and Mining)

  • 2016
    2014

    Reviewer

    DMKD (Data Mining and Knowledge Discovery)

  • GRANTS
  • 2013
    2014
    Notre Dame, USA

    MOBILE MONEY AND COMING OF AGE IN WESTERN KENYA

    BILL AND MELINDA GATES FOUNDATION

    No. OPP1031657, subcontract No. 2014-3074
  • HONORS AND AWARDS
  • 2014
    2015
    Notre Dame, USA

    OUTSTANDING RESEARCH ASSISTANT

    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

    Research Assistant with Outstanding Achievement Selected by Faculty Members
  • 2010
    2011
    Notre Dame, USA

    BEST POSTER AWARD

    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

    Sixth Annual Student Research Symposium poster contest, CSE department, University of Notre Dame.
  • 2007
    2008
    Shanghai, CHINA

    KNIGHT IN TOWN FROM INFOSYS

    INFOSYS, CHINA

    Annual Outstanding Employee
  • 2006
    2007
    Nanjing, CHINA

    THE EXCELLENT STUDENT AWARD

    Nanjing University

  • 2006
    2007
    Nanjing, CHINA

    THE EXCELLENT GRADUATE AWARD

    Nanjing University

  • 2005
    2006
    Nanjing, CHINA

    THE NATIONAL SCHOLARSHIP

    Nanjing University

  • 2004
    2005
    Nanjing, CHINA

    THE PEOPLE'S SCHOLARSHIP

    Nanjing University

  • 2003
    2004
    Nanjing, CHINA

    THE PEOPLE'S SCHOLARSHIP

    Nanjing University

.03

PUBLICATIONS

PUBLICATIONS LIST
30 JUN 2016

Hearthholds of Mobile Money in Western Kenya

Economic Anthropology 3(2):266-279

Kenyans use mobile money services to transfer money to friends and relatives via mobile phone text messaging. Kenya’s M-Pesa is one of the most successful examples of digital money for financial inclusion. This article uses social network analysis and ethnographic information to examine ties to and through women in 12 mobile money transfer networks of kin, drawn from field data collected in 2012, 2013, and 2014.

Journal Paper Sibel Kusimba, Yang Yang, Nitesh V. Chawla

Kenyans use mobile money services to transfer money to friends and relatives via mobile phone text messaging. Kenya’s M-Pesa is one of the most successful examples of digital money for financial inclusion. This article uses social network analysis and ethnographic information to examine ties to and through women in 12 mobile money transfer networks of kin, drawn from field data collected in 2012, 2013, and 2014. The social networks are based on reciprocal and dense ties among siblings and parents, especially mothers. Men participate equally in social networks, but as brothers and mother’s brothers more often than as fathers. The matrilineal ties of mobile money circulate value within the hearthhold (Ekejiuba 2005) of women, their children, and others connected to them. Using remittances, families negotiate investments in household farming or work, education, and migration. Money sending supports the diverse economic strategies, flexible kinship ties, and mobility of hearthholds. Gifts of e-money are said to express a natural love and caring among mothers and siblings and are often private and personal. Consequently, the money circulations of the hearthhold avoid disrupting widely shared ideals of patrilineal solidarity and household autonomy.

25 JUN 2016

The Global Terrorism Network: Power Law Foundations of System Behavior

2nd Annual International Conference on Computational Social Science

Current work on global terrorism networks has argued that interventions and predictions can be based on the finding that the severity of attacks follows a power law. Using a dramatically more comprehensive dataset on global terrorism than previously available, we find that the distribution of the severity of attacks does not conform to power-law distributions when looked at over time.

TALKS/POSTERS Yang Yang, Adam Pah and Brian Uzzi

Current work on global terrorism networks has argued that interventions and predictions can be based on the finding that the severity of attacks follows a power law (Clauset et al., 2007). Using a dramatically more comprehensive dataset on global terrorism than previously available, we find that the distribution of the severity of attacks does not conform to power-law distributions when looked at over time, from 1970 to today, and when further decomposed into national and international terrorist networks. Specifically, we find that a power law has limited explanatory power for global-scale terrorist attacks and that its application is context dependent. While the severity of attacks carried out by international groups may follow a power law, domestic groups appear to follow an exponential distribution, which lacks the same underlying mechanism. This suggests that governmental efforts and targeting practices should change given the nature of the group involved.

25 JUN 2016

The Formation and Imprinting of Network Effects Among the Business Elites

2nd Annual International Conference on Computational Social Science

We collected and analyzed more than 4.5 million time-stamped emails from students at a globally top-ranked MBA program, focusing specifically on the relationship between students' evolving communication networks and their subsequent career outcomes.

TALKS/POSTERS Brian Uzzi, Yang Yang, Kevin Gaughan

The “business elite” constitutes a small but strikingly influential subset of the population, oftentimes affecting important societal outcomes such as the consolidation of political power (Padgett and Ansell, 1993), the adoption of corporate governance practices, and the stability of national economies more broadly. Research has shown that this exclusive community often resembles a densely structured network, where elites exchange privileged access to capital, market information, and political clout in an attempt to preserve their economic interests and maintain the status quo (Useem, 1982). While there is general awareness that connections among the business elite arise because “elites attend the same schools, belong to the same clubs, and in general are in the same place at the same time”, surprisingly little is known about the network dynamics that emerge within these formative settings. Here we analyze a unique dataset of all MBA students at a top 5 MBA program. Students were randomly assigned to their first classes; friendship among students prior to coming into the program was rare; and the network data (email transmissions among students) were collected for the year 2006, when students almost entirely used the school's email server to communicate, thereby providing an excellent proxy for their networks. After matching students on all available characteristics (e.g., age, grade scores, and industry experience), i.e., creating “twin pairs”, we find that the distinguishing characteristic between students who do well in job placement and those who do not is their network. Further, we find that the network differences between the successful and unsuccessful students develop within the first month of class and persist thereafter, suggesting a network imprinting that is persistent. Finally, we find that these effects are pronounced for students who are at the extreme ends of the distribution on other measures of success: students with the best expected job placement do particularly poorly without the right network (“descenders”), whereas students with the worst expected job placement pull themselves to the top of the placement hierarchy (“ascenders”) with the right network.

26 MAY 2016

Influence Activation Model: A New Perspective in Social Influence Analysis and Social Network Evolution

Under Submission

What drives the propensity for social network dynamics? Social influence is believed to drive both off-line and on-line human behavior; however, it has not been considered as a driver of social network evolution. Our analysis suggests that, while the network structure affects the spread of influence in social networks, the network is in turn shaped by social influence activity (i.e., the process of social influence wherein one person's attitudes and behaviors affect another's).

Journal Paper Yang Yang, Nitesh V. Chawla, Ryan N. Lichtenwalter, Yuxiao Dong

What drives the propensity for social network dynamics? Social influence is believed to drive both off-line and on-line human behavior; however, it has not been considered as a driver of social network evolution. Our analysis suggests that, while the network structure affects the spread of influence in social networks, the network is in turn shaped by social influence activity (i.e., the process of social influence wherein one person's attitudes and behaviors affect another's). To that end, we develop a novel model of network evolution in which the dynamics of the network follow the mechanism of influence propagation, which is not captured by existing network evolution models. Our experiments confirm the predictions of our model and demonstrate the important role that social influence can play in the process of network evolution. In addition to exploring the causes of social network evolution, we find that different types of social influence have different effects on network dynamics. These findings and methods are essential to both our understanding of the mechanisms that drive network evolution and our knowledge of the role of social influence in shaping the network structure.

10 SEPT 2015

Family networks of mobile money in Kenya

Information Technologies & International Development 11(3), 2015

This research examines the interplay between social networks and mobile money remittances in Western Kenya. Research was conducted in Kenya's Bungoma and Trans-Nzoia counties in 2012, 2013, and 2014, involving 12 family networks of 8 to 70 people.

Journal Paper Sibel B Kusimba, Yang Yang, Nitesh V Chawla

This research examines the interplay between social networks and mobile money remittances in Western Kenya. Research was conducted in Kenya’s Bungoma and Trans-Nzoia counties in 2012, 2013, and 2014, involving 12 family networks of 8 to 70 people. Using small and frequent digital money transfers, relatives provide for household and emergency needs, contribute to ceremonies, and help pay school fees and medical bills. We find that digital money transfers follow and reinforce preexisting forms of emotional support and social relationships. In these families, the transfers strengthen maternal kinship ties as well as relationships among siblings and cousins. Money networks are reciprocal, such that senders are also receivers, and individuals have many connections through which to access resources. Some individuals are “central” in networks, having more connections; others broker flows of e-value from one group of relatives to another. Mobile money strengthens social bonds but can also disrupt social relationships, as when hiding digital value and remittances from in-laws or spouses.

07 SEPT 2015

The Evolution of Social Relationships and Strategies Across the Lifespan

Machine Learning and Knowledge Discovery in Databases 9286, pp. 245-249

In this work, we unveil the evolution of social relationships across the lifespan. This evolution reflects the dynamic social strategies that people use to fulfill their social needs. For this work we utilize a large mobile network complete with user demographic information.

Book Chapters Yuxiao Dong, Nitesh V. Chawla, Jie Tang, Yang Yang, Yang Yang

In this work, we unveil the evolution of social relationships across the lifespan. This evolution reflects the dynamic social strategies that people use to fulfill their social needs. For this work we utilize a large mobile network complete with user demographic information. We find that while younger individuals are active in broadening their social relationships, seniors tend to keep small but closed social circles. We further demonstrate that opposite-gender interactions between two young individuals are much more frequent than those between young same-gender people, while the situation is reversed after around 35 years old. We also discover that while same-gender triadic social relationships are persistently maintained over a lifetime, the opposite-gender triadic circles are unstable upon entering into middle-age. Finally we demonstrate a greater than 80% potential predictability for inferring users’ gender and a 73% predictability for age from mobile communication behaviors.

25 AUG 2015

Collaboration Signatures Reveal Scientific Impact

Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015

Collaboration is an integral element of the scientific process that often leads to findings with significant impact. While extensive efforts have been devoted to quantifying and predicting research impact, the question of how collaborative behavior influences scientific impact remains unaddressed. In this work, we study the interplay between scientists' collaboration signatures and their scientific impact.

CONFERENCES Yuxiao Dong, Reid A Johnson, Yang Yang, Nitesh V Chawla

Collaboration is an integral element of the scientific process that often leads to findings with significant impact. While extensive efforts have been devoted to quantifying and predicting research impact, the question of how collaborative behavior influences scientific impact remains unaddressed. In this work, we study the interplay between scientists' collaboration signatures and their scientific impact. As the basis of our study, we employ an ArnetMiner dataset with more than 1.7 million authors and 2 million papers spanning over 60 years. We formally define a scientist's collaboration signature as the distribution of collaboration strengths with each collaborator in his or her academic ego network, which is quantified by four measures: sociability, dependence, diversity, and self-collaboration. We then demonstrate that the collaboration signature allows us to effectively distinguish between researchers with dissimilar levels of scientific impact. We also discover that, even from the early stages of one's research career, a scientist's collaboration signature can help to reveal his or her future scientific impact. Finally, we find that as a representative group of outstanding computer scientists, Turing Award winners collectively produce distinctive collaboration signatures throughout the entirety of their careers. Our conclusions on the relationship between collaboration signatures and scientific impact give rise to important implications for researchers who wish to expand their scientific impact and more effectively stand on the shoulders of "collaborators."

30 MAR 2015

Inferring Social Status and Rich Club Effects in Enterprise Communication Networks

PLoS ONE 10(3), e0119446

In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks.

Journal Paper Yuxiao Dong, Jie Tang, Nitesh V. Chawla, Tiancheng Lou, Yang Yang, Bai Wang

Social status, defined as the relative rank or position that an individual holds in a social hierarchy, is known to be among the most important motivating forces in social behaviors. In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks. To that end, we use two enterprise datasets with three communication channels (voice call, short message, and email) to demonstrate the social-behavioral differences among individuals with different status. We have several interesting findings and based on these findings we also develop a model to predict social status. On the individual level, high-status individuals are more likely to be spanned as structural holes by linking to people in parts of the enterprise networks that are otherwise not well connected to one another. On the community level, the principle of homophily, social balance and clique theory generally indicate a “rich club” maintained by high-status individuals, in the sense that this community is much more connected, balanced and dense. Our model can predict social status of individuals with 93% accuracy.

28 NOV 2014

Predicting Node Degree Centrality with the Node Prominence Profile

Nature Scientific Reports 4, Article number: 7236

Centrality of a node measures its relative importance within a network. We develop a method that reconciles preferential attachment and triadic closure to capture a node's prominence profile. We show that the proposed node prominence profile method is an effective predictor of degree centrality.

Journal Paper Selected Yang Yang, Yuxiao Dong, Nitesh V. Chawla

Centrality of a node measures its relative importance within a network. There are a number of applications of centrality, including inferring the influence or success of an individual in a social network, and the resulting social network dynamics. While we can compute the centrality of any node in a given network snapshot, a number of applications are also interested in knowing the potential importance of an individual in the future. However, current centrality is not necessarily an effective predictor of future centrality. While there are different measures of centrality, we focus on degree centrality in this paper. We develop a method that reconciles preferential attachment and triadic closure to capture a node's prominence profile. We show that the proposed node prominence profile method is an effective predictor of degree centrality. Notably, our analysis reveals that individuals in the early stage of evolution display a distinctive and robust signature in degree centrality trend, adequately predicted by their prominence profile. We evaluate our work across four real-world social networks. Our findings have important implications for the applications that require prediction of a node's future degree centrality, as well as the study of social network dynamics.

09 OCT 2014

Evaluating Link Prediction Methods

Knowledge and Information Systems (KAIS'14), Springer, DOI: 10.1007/s10115-014-0789-0

Link prediction is a popular research area with important applications in a variety of disciplines, including biology, social science, security, and medicine. We describe these challenges, provide theoretical proofs and empirical examples demonstrating how current methods lead to questionable conclusions, show how the fallacy of these conclusions is illuminated by methods we propose, and develop recommendations for consistent, standard, and applicable evaluation metrics.

Journal Paper Selected Yang Yang, Ryan N. Lichtenwalter, Nitesh V. Chawla

Link prediction is a popular research area with important applications in a variety of disciplines, including biology, social science, security, and medicine. The fundamental requirement of link prediction is the accurate and effective prediction of new links in networks. While there are many different methods proposed for link prediction, we argue that the practical performance potential of these methods is often unknown because of challenges in the evaluation of link prediction, which impact the reliability and reproducibility of results. We describe these challenges, provide theoretical proofs and empirical examples demonstrating how current methods lead to questionable conclusions, show how the fallacy of these conclusions is illuminated by methods we propose, and develop recommendations for consistent, standard, and applicable evaluation metrics. We also recommend the use of precision-recall threshold curves and associated areas in lieu of receiver operating characteristic curves due to complications that arise from extreme imbalance in the link prediction classification problem.

5 OCT 2014

Link Prediction: A Primer

Encyclopedia of Social Network Analysis and Mining

Link prediction is an important task in network analysis, benefiting researchers and organizations in a variety of fields. There are a variety of techniques for link prediction, ranging from feature-based classification to probabilistic models and matrix factorization. In this entry, we mainly discuss how to solve the link prediction problem as a supervised classification task.

Book Chapters Nitesh V. Chawla, Yang Yang

24 AUG 2014

Inferring User Demographics and Social Strategies in Mobile Social Networks

Proc. of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'14)

Demographics are widely used in marketing to characterize different types of customers. However, in practice, demographic information such as age, gender, and location is usually unavailable due to privacy and other reasons. We discover several interesting social strategies that mobile users frequently use to maintain their social connections.

CONFERENCES Yuxiao Dong, Yang Yang, Jie Tang, Yang Yang, Nitesh V. Chawla

Demographics are widely used in marketing to characterize different types of customers. However, in practice, demographic information such as age, gender, and location is usually unavailable due to privacy and other reasons. In this paper, we aim to harness the power of big data to automatically infer users' demographics based on their daily mobile communication patterns. Our study is based on a real-world large mobile network of more than 7,000,000 users and over 1,000,000,000 communication records (CALL and SMS). We discover several interesting social strategies that mobile users frequently use to maintain their social connections. First, young people are very active in broadening their social circles, while seniors tend to keep close but more stable connections. Second, female users put more attention on cross-generation interactions than male users, though interactions between male and female users are frequent. Third, a persistent same-gender triadic pattern over one's lifetime is discovered for the first time, while more complex opposite-gender triadic patterns are only exhibited among young people.

We further study to what extent users' demographics can be inferred from their mobile communications. As a special case, we formalize a problem of double dependent-variable prediction: inferring user gender and age simultaneously. We propose the WhoAmI method, a Double Dependent-Variable Factor Graph Model, to address this problem by considering not only the effects of features on gender/age, but also the interrelation between gender and age. Our experiments show that the proposed WhoAmI method significantly improves the prediction accuracy by up to 10% compared with several alternative methods.

1 DEC 2013

Perspective on Measurement Metrics for Community Detection Algorithms

Mining Social Networks and Security Informatics

In this chapter we present the performance of community detection algorithms on real world networks and their corresponding benchmark networks, which are designed to demonstrate the differences between real world networks and benchmark networks.

Book Chapters Yang Yang, Yizhou Sun, Saurav Pandit, Nitesh V. Chawla, Jiawei Han

Community detection or cluster detection in networks is often at the core of mining network data. Although the problem is well studied, given the scale and complexity of modern-day social networks, detecting "reasonable" communities is often a hard problem. Since the first use of the k-means algorithm in the 1960s, many community detection algorithms have been presented, most of which are developed with specific goals in mind, and the idea of detecting meaningful communities varies widely from one algorithm to another.

As the number of clustering algorithms grows, so does the number of metrics for measuring them. Algorithms are often reduced to optimizing the value of an objective function such as modularity or internal density. Some of these metrics rely on ground truth; some do not. In this chapter we study these algorithms and aim to find whether these optimization-based measurements are consistent with the real performance of community detection algorithms. Seven representative algorithms are compared under various performance metrics and on various real world social networks.

The difficulties of measuring community detection algorithms are mostly due to the unavailability of ground-truth information, so objective functions, such as modularity, are used as substitutes. Benchmark networks that simulate real world networks with planted community structure have been introduced to tackle the unavailability of ground-truth information; however, whether the simulation is precise and useful has not been verified. In this chapter we present the performance of community detection algorithms on real world networks and their corresponding benchmark networks, which are designed to demonstrate the differences between real world networks and benchmark networks.

18 NOV 2013

Red Black Network: Temporal and Topological Analysis of Two Intertwined Social Networks

32nd Military Communications Conference (MILCOM'13)

In this paper we introduce and study the properties of a certain kind of interdependent networks that we collectively call a Red Black Network - two intertwined social networks that work together towards a series of events (missions or performances). We find that the statistical properties of two such networks are highly correlated, and use that finding to devise a prediction mechanism for such properties in a scenario when one of the two networks is invisible or only partially visible.

CONFERENCES Saurav Pandit, Jonathan Koch, Yang Yang, Nitesh V. Chawla, Brian Uzzi

In this paper we introduce and study the properties of a certain kind of interdependent networks that we collectively call a Red Black Network - two intertwined social networks that work together towards a series of events (missions or performances). More specifically, members of one of the two networks are responsible for planning and organizing the events. They will generally be referred to as artists. Members of the other network, henceforth called actors, are responsible for the execution of the events. Using temporal data from the performing arts industry, we study the co-evolution of two such co-dependent social networks. We find that the statistical properties of two such networks are highly correlated, and use that finding to devise a prediction mechanism for such properties in a scenario when one of the two networks is invisible or only partially visible. This also sets up a framework for our ultimate goal of temporal, semi-blind, multi-relational link prediction.

25 AUG 2013

Link Prediction in Human Mobility Networks

The International Conference on Advances in Social Networks Analysis and Mining (ASONAM'13)

The understanding of how humans move is a long-standing challenge in the natural sciences. In this paper we study human mobility behaviors from the perspective of network science. We approach the problem using link prediction techniques, and our methodology is demonstrated to have a greater degree of precision in predicting future mobility topology.

CONFERENCES Yang Yang, Nitesh V. Chawla, Prithwish Basu, Bhaskar Prabhala, Thomas La Porta

The understanding of how humans move is a long-standing challenge in the natural sciences. An important question is, to what degree is human behavior predictable? The ability to foresee the mobility of humans is crucial from predicting the spread of human to urban planning. Previous research has focused on predicting individual mobility behavior, such as the next-location prediction problem. In this paper we study human mobility behaviors from the perspective of network science. In the human mobility network, there is a link between two humans if they are physically proximal to each other. We perform both microscopic and macroscopic explorations of human mobility patterns. From the microscopic perspective, our objective is to answer whether two humans will be in proximity of each other or not. From the macroscopic perspective, we are interested in whether we can infer the future topology of the human mobility network. We explore both problems using link prediction techniques, and our methodology is demonstrated to have a greater degree of precision in predicting future mobility topology.

10 DEC 2012

Predicting Links in Multi-Relational and Heterogeneous Networks

Proc. of the 12th IEEE International Conference on Data Mining (ICDM'12)

Link prediction is an important task in network analysis, benefiting researchers and organizations in a variety of fields. Many networks in the real world, for example social networks, are heterogeneous, having multiple types of links and complex dependency structures. In this paper, we introduce Multi-Relational Influence Propagation (MRIP), a novel probabilistic method for heterogeneous networks.

CONFERENCES Selected Yang Yang, Nitesh V. Chawla, Yizhou Sun, and Jiawei Han.

Link prediction is an important task in network analysis, benefiting researchers and organizations in a variety of fields. Many networks in the real world, for example social networks, are heterogeneous, having multiple types of links and complex dependency structures. Link prediction in such networks must model the influence propagating between heterogeneous relationships to achieve better link prediction performance than in homogeneous networks. In this paper, we introduce Multi-Relational Influence Propagation (MRIP), a novel probabilistic method for heterogeneous networks. We demonstrate that MRIP is useful for predicting links in sparse networks, which present a significant challenge due to the severe disproportion of the number of potential links to the number of real formed links. We also explore some factors that can inform the task of classification yet remain unexplored, such as temporal information. In this paper we make use of the temporal-related features by carefully investigating the issues of feasibility and generality. In accordance with our work in unsupervised learning, we further design an appropriate supervised approach in heterogeneous networks. Our experiments on co-authorship prediction demonstrate the effectiveness of our approach.

10 DEC 2012

Maximizing Information Spread Through Influence Structures in Social Networks

DaMNet Workshop, in Proc. of the 12th IEEE International Conference on Data Mining (ICDM'12)

Finding the most influential nodes in a network has recently been a much-discussed research topic in network science, especially in social network analysis. We present a simple yet scalable (polynomial time) algorithm that outperforms the existing state of the art, and its success does not depend significantly on any kind of tuning parameter.

WORKSHOPS Saurav Pandit, Yang Yang, and Nitesh V. Chawla

Finding the most influential nodes in a network has recently been a much-discussed research topic in network science, especially in social network analysis. The topic of this paper is a related, but harder, problem. Given a social network where neighbors can influence each other, the problem is to identify k nodes such that if a piece of information is placed on each of those k nodes, the overall spread of that information (via word-of-mouth or other methods of influence flow) is maximized. The amount of information spread can be measured using existing information propagation models. Recent studies, which focus on how quickly k highly influential nodes can be found, tend to ignore the overall effect of the information spread. On the other hand, some legacy methods, which look at all possible propagation paths to compute a globally optimal target set, present severe scalability challenges in large-scale networks. We present a simple yet scalable (polynomial time) algorithm that outperforms the existing state of the art, and its success does not depend significantly on any kind of tuning parameter. To be more precise, when compared to the existing algorithms, the output set of k nodes produced by our algorithm facilitates higher information spread in almost all instances, consistently across the commonly used information propagation models. The original algorithm in this paper, although scalable, can have a higher running time than some standard approaches, e.g., simply picking the top k nodes with the highest degree or highest PageRank value. To that end, we provide an optional speedup mechanism that considerably reduces the time complexity while not significantly affecting the quality of results vis-a-vis the full version of our algorithm.
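
To make the problem setting concrete, here is a minimal Python sketch of the degree-based baseline mentioned above, together with a Monte Carlo estimate of spread under the independent cascade model (one commonly used propagation model). It is not the algorithm proposed in the paper. The function names, the adjacency-list input format, and the uniform propagation probability `p` are assumptions made for this illustration.

```python
import random
from collections import deque


def top_k_by_degree(adj, k):
    """Baseline seed selection: the k nodes with the highest degree.

    `adj` maps each node to a list of its neighbors (assumed format).
    Ties are broken by node id so the result is deterministic.
    """
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


def independent_cascade(adj, seeds, p, rng):
    """Run one independent-cascade simulation and return the activated set."""
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        node = frontier.popleft()
        for nbr in adj[node]:
            # A newly activated node gets a single chance, with
            # probability p, to activate each still-inactive neighbor.
            if nbr not in active and rng.random() < p:
                active.add(nbr)
                frontier.append(nbr)
    return active


def expected_spread(adj, seeds, p, runs=200, seed=0):
    """Average number of activated nodes over `runs` simulations."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(adj, seeds, p, rng)) for _ in range(runs))
    return total / runs
```

Comparing `expected_spread` for seed sets chosen by different selection strategies is one simple way to evaluate them against each other under a fixed propagation model.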

29 MAY 2012

ALIVE: A Multi-Relational Link Prediction Environment for the Healthcare Domain

Third Workshop on Data Mining for Healthcare Management (PAKDD'12)

In this paper, we propose ALIVE, a multirelational link prediction and visualization environment for the healthcare domain. ALIVE combines novel link prediction methods with a simple user interface and intuitive visualization of data to enhance the decision-making process for healthcare professionals.

WORKSHOPS Reid Johnson, Yang Yang, Everaldo Aguiar, Andrew Rider, and Nitesh V. Chawla

An underlying assumption of biomedical informatics is that decisions can be more informed when professionals are assisted by analytical systems. For this purpose, we propose ALIVE, a multirelational link prediction and visualization environment for the healthcare domain. ALIVE combines novel link prediction methods with a simple user interface and intuitive visualization of data to enhance the decision-making process for healthcare professionals. It also includes a novel link prediction algorithm, MRPF, which outperforms many comparable algorithms on multiple networks in the biomedical domain. ALIVE is one of the first attempts to provide an analytical and visual framework for healthcare analytics, promoting collaboration and sharing of data through ease of use and potential extensibility. We encourage the development of similar tools, which can assist in facilitating successful sharing, collaboration, and a vibrant online community.

29 APR 2012

Detecting Communities in Time-evolving Proximity Networks

IEEE Network Science Workshop (NSW'12)

In this paper, we introduce the notion of spatio-temporal communities that attempts to capture the structure in spatial connections as well as temporal changes in a network. We illustrate the notion via several examples and list the challenges in effectively discovering spatio-temporal communities. We present an approach that first extracts concurrency information via node-clustering on each snapshot.

WORKSHOPS Saurav Pandit, Yang Yang, Vikas Kawadia, Sameet Sreenivasan, and Nitesh V. Chawla

The patterns of interactions between individuals in a population implicitly contain a remarkable amount of information. This information, if extracted, could be of significant importance in several realms, such as containing the spread of disease, understanding information flow in social systems, and predicting likely future interactions. A popular method of discovering structure in networks is community detection, which attempts to capture the extent to which a network differs from a random network. However, communities are not very well defined for time-varying networks. In this paper, we introduce the notion of spatio-temporal communities, which attempts to capture the structure in spatial connections as well as temporal changes in a network. We illustrate the notion via several examples and list the challenges in effectively discovering spatio-temporal communities. For example, such communities are lost if the temporal interactions are aggregated into a single weighted network, since the concurrency information is lost. We present an approach that first extracts concurrency information via node clustering on each snapshot. Each node is then assigned a vector of community memberships over time, which is then used to group nodes into overlapping communities via recently introduced link clustering techniques. However, we measure similarity (of nodes and edges) based on concurrence, i.e., when they existed and whether they existed together. We call our approach the co-community algorithm. We validate our approach using several real-world data sets spanning multiple contexts.

25 JUL 2011

Is Objective Function the Silver Bullet? A Case Study of Community Detection Algorithms on Social Networks

The International Conference on Advances in Social Networks Analysis and Mining (ASONAM'11)

Community detection, or cluster detection, in networks is a well-studied, albeit hard, problem. Many kinds of metrics of community quality have been introduced, especially objective functions such as modularity and internal density. Our work aims to answer whether these commonly used objective functions are consistent with the real performance of community detection algorithms; comparing the performance of algorithms is not the purpose of our study.

CONFERENCES Selected Yang Yang, Yizhou Sun, Saurav Pandit, Nitesh Chawla, and Jiawei Han

Community detection, or cluster detection, in networks is a well-studied, albeit hard, problem. Given the scale and complexity of modern-day social networks, detecting "reasonable" communities is an even harder problem. Since the first use of the k-means algorithm in the 1960s, many community detection algorithms have been invented, most of which were developed with specific goals in mind, and the idea of detecting "meaningful" communities varies widely from one algorithm to another. With the increasing number of community detection algorithms, many kinds of metrics of community quality have been introduced, especially objective functions such as modularity and internal density. In this paper we divide measurement methods into two categories, according to whether they rely on ground truth or not. Our work aims to answer whether these commonly used objective functions are consistent with the real performance of community detection algorithms; comparing the performance of algorithms is not the purpose of our study. Seven representative algorithms are compared under various performance metrics and on various "real world" social networks.
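
For readers unfamiliar with modularity, the objective function named above, the following Python sketch computes the standard Newman modularity of a given partition. It assumes an undirected, unweighted graph without self-loops, supplied as an edge list, and a dictionary mapping each node to a community label. The function name and input format are choices made for this illustration and are not taken from the paper.

```python
def modularity(edges, community):
    """Newman modularity Q = sum_c [ e_c / m - (d_c / (2m))^2 ].

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community label
    e_c is the number of edges inside community c, d_c is the sum of the
    degrees of its nodes, and m is the total number of edges (m > 0).
    """
    m = len(edges)
    internal = {}    # e_c per community
    degree_sum = {}  # d_c per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        # Each edge contributes one unit of degree to each endpoint.
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )
```

A high value of Q says only that a partition has more internal edges than expected at random. Whether that agrees with ground-truth communities is the question the paper examines.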

.04

RESEARCH

RESEARCH AREAS

Link Prediction in Complex Networks

Link Prediction in Heterogeneous Networks and Evaluating Link Prediction Methods

Link Prediction in Heterogeneous Networks: Perform time series analysis to model temporal information in link prediction problems and combine these approaches with social influence analysis to estimate link likelihood in multi-relational and heterogeneous networks.

Evaluating Link Prediction Methods: Provide theoretical proofs and empirical examples demonstrating how most current link prediction evaluations lead to questionable conclusions. Develop recommendations for consistent, standard, and applicable evaluation metrics for the link prediction problem. This work sheds new light on the link prediction problem and has high impact on the development of link analysis.

Social Network Analysis

Gender-specific Difference Analysis and Social Network Analysis in the Field of Anthropology

Trader Alliance Network Evolution in Unstable Economic Conditions: Investigate and model the relationship between traders' behavior, traders' network dynamics, and prevailing political and regulatory conditions. Provide a theoretical understanding of economic group behaviors and a basis for policies addressing the role and resilience of emergent behaviors and institutions in unstable economic conditions.

Gender Difference Analysis of Human Communication Networks: Study gender-specific differences in a multi-dimensional social system that contains cell phone contacts, instant messaging, location traces, app usage logs, music listening logs, etc., casting new light on the control of gender-specific information spreading, on marketing, and on the smooth implementation of online social networks.

Global Terrorism Network Analysis

Explore the mechanisms underlying the dynamics of global terrorism

Terrorism System Behaviors: Current work on global terrorism networks has argued that interventions and predictions can be based on the finding that the severity of attacks follows a power law. Using a dramatically more comprehensive dataset on global terrorism than previously available, we find that the distribution of the severity of attacks does not conform to power-law distributions when looked at over time, from 1970 to today. This suggests that governmental efforts and targeting practices should change given the nature of the group involved.
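
As background on what a power-law fit involves, the sketch below implements the standard maximum-likelihood estimator of the exponent for a continuous power-law tail, alpha = 1 + n / sum(ln(x_i / x_min)). It is a generic textbook estimator and not necessarily the procedure used in this project. The function name, the choice of `x_min`, and the continuous approximation for count data such as attack severity are all assumptions made for this example.

```python
import math


def power_law_alpha_mle(values, x_min):
    """Maximum-likelihood estimate of a continuous power-law exponent.

    Only observations with x >= x_min (the assumed start of the
    power-law tail) are used: alpha = 1 + n / sum(ln(x / x_min)).
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail observations equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum
```

Estimating the exponent separately for successive time windows, and then testing goodness of fit in each window, is one way to check whether a single power law holds over the whole period.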

Predicting Groups' Future Lethality: We also find that the randomness of group behaviors is highly correlated with their lethality (average kills per year). Our proposed metric of group behavioral randomness is demonstrated to be an effective predictor of group lethality. Causal inference analysis and time series analysis are applied, and the connection between randomness and lethality is confirmed to be significant. Notably, our analysis reveals that groups in the early stage of evolution display a distinctive and robust signature in their lethality trend, adequately predicted by their randomness.

Social Network and Career Success

Explore the connection between social networks and career success

The Formation and Imprinting of Network Effects Success: The "business elite" constitutes a small but strikingly influential subset of the population, oftentimes affecting important societal outcomes. We analyze a unique dataset of all MBA students at a top-5 MBA program. We find that the distinguishing characteristic between students who do well in job placement and those who do not is their network. Further, we find that the network differences between the successful and unsuccessful students develop within the first month of class and persist thereafter, suggesting a persistent network imprinting.

Recipes for Career Success Differ for Men and Women: This study examines the mechanisms by which such gender imbalances persist. We use a quasi-experimental approach to analyze 4.5 million emails from MBA students and find that the network configurations associated with early career success are distinctly different for men and women. These "gender-specific" networks also predict whether men or women will exceed expectations with respect to the jobs they secure after graduation.
.05

SKILLS

PROGRAMMING SKILLS
R Programming
LEVEL : ADVANCED | EXPERIENCE : 4 YEARS
Java Programming
LEVEL : ADVANCED | EXPERIENCE : 9 YEARS
SQL/Database
LEVEL : INTERMEDIATE | EXPERIENCE : 3 YEARS
MATLAB
LEVEL : INTERMEDIATE | EXPERIENCE : 4 YEARS
STATA
LEVEL : NOVICE | EXPERIENCE : 2 YEARS
RESEARCH SKILLS
Network Analysis
LEVEL : ADVANCED | EXPERIENCE : 5 YEARS
Link Prediction, Social Influence Analysis, Social Network Modeling
Causality Inference
LEVEL : INTERMEDIATE | EXPERIENCE : 5 YEARS
Coarsened Exact Matching, Propensity Score Stratification
Data Mining
LEVEL : ADVANCED | EXPERIENCE : 5 YEARS
Machine Learning, Imbalanced Data, Clustering/Classification
.06

CONTACT

Get in touch

