Peer-reviewed publications
Relation-weighted Link Prediction for Disease Gene Identification
Berk Kapicioglu*, Srivamshi Pittala*, William Koehler, Jonathan Deans, Daniel Salinas, Martin Bringmann, Katharina Sophia Volz
Neural Information Processing Systems (NeurIPS), 4th Knowledge Representation and Reasoning Meets Machine Learning Workshop (KR2ML), 2020.
Paper – Poster – Website
* Denotes equal contribution.
Identification of disease genes, which are a set of genes associated with a disease, plays an important role in understanding and curing diseases. In this paper, we present a biomedical knowledge graph designed specifically for this problem, propose a novel machine learning method that identifies disease genes on such graphs by leveraging recent advances in network biology and graph representation learning, study the effects of various relation types on prediction performance, and empirically demonstrate that our algorithm outperforms its closest state-of-the-art competitor in disease gene identification by 24.1%. We also show that we achieve higher precision than Open Targets, the leading initiative for target identification, with respect to predicting drug targets in clinical trials for Parkinson’s disease.
Biomedical Information Extraction for Disease Gene Prioritization
Berk Kapicioglu*, Jupinder Parmar*, William Koehler*, Martin Bringmann, Katharina Sophia Volz
Neural Information Processing Systems (NeurIPS), 4th Knowledge Representation and Reasoning Meets Machine Learning Workshop (KR2ML), 2020.
Paper – Poster – Website
* Denotes equal contribution.
We introduce a biomedical information extraction (IE) pipeline that extracts biological relationships from text and demonstrate that its components, such as named entity recognition (NER) and relation extraction (RE), outperform the state of the art in BioNLP. We apply it to tens of millions of PubMed abstracts to extract protein-protein interactions (PPIs) and add these extractions to a biomedical knowledge graph that already contains PPIs extracted from STRING, the leading structured PPI database. We show that, even though the graph already contains PPIs from an established structured source, augmenting it with our own IE-based extractions allows us to predict novel disease-gene associations with a 20% relative increase in hit@30, an important step towards developing drug targets for uncured diseases.
Chess2vec: Learning Vector Representations for Chess
Berk Kapicioglu, Ramiz Iqbal, Tarik Koc, Louis Nicolas Andre, Katharina Sophia Volz
Neural Information Processing Systems (NeurIPS), Relational Representation Learning Workshop, 2018.
Long Paper – Short Paper – Poster – Spotlight – Website
We conduct the first study of its kind to generate and evaluate vector representations for chess pieces. In particular, we uncover the latent structure of chess pieces and moves, as well as predict chess moves from chess positions. We share preliminary results which anticipate our ongoing work on a neural network architecture that learns these embeddings directly from supervised feedback.
Tip Ranker: A Machine Learning Approach to Ranking Short Reviews
Enrique Cruz, Berk Kapicioglu
ACM Recommender Systems Conference (RecSys), Poster Proceedings, 2016.
Paper – Poster
Collaborative Place Models
Berk Kapicioglu, David S. Rosenberg, Robert E. Schapire, Tony Jebara
International Joint Conference on Artificial Intelligence (IJCAI), 2015.
Paper – Supplement 1 – Supplement 2 – Poster
A fundamental problem underlying location-based tasks is to construct a complete profile of users’ spatiotemporal patterns. In many real-world settings, the sparsity of location data makes it difficult to construct such a profile. As a remedy, we describe a Bayesian probabilistic graphical model, called Collaborative Place Model (CPM), which infers similarities across users to construct complete and time-dependent profiles of users’ whereabouts from unsupervised location data. We apply CPM to both sparse and dense datasets, and demonstrate how it both improves location prediction performance and provides new insights into users’ spatiotemporal patterns.
Collaborative Ranking for Local Preferences
Berk Kapicioglu, David S. Rosenberg, Robert E. Schapire, Tony Jebara
Artificial Intelligence and Statistics (AISTATS), 2014.
Paper – Supplement – Poster
For many collaborative ranking tasks, we have access to relative preferences among subsets of items, but not to global preferences among all items. To address this, we introduce a matrix factorization framework called Collaborative Local Ranking (CLR). We justify CLR by proving a bound on its generalization error, the first such bound for collaborative ranking that we know of. We then derive a simple alternating minimization algorithm and prove that its running time is independent of the number of training examples. We apply CLR to a novel venue recommendation task and demonstrate that it outperforms state-of-the-art collaborative ranking methods on real-world data sets.
Combining Spatial and Telemetric Features for Learning Animal Movement Models
Berk Kapicioglu, Robert E. Schapire, Martin Wikelski, Tamara Broderick
Uncertainty in Artificial Intelligence (UAI), 2010.
Paper – Poster
We introduce a new graphical model for tracking radio-tagged animals and learning their movement patterns. The model provides a principled way to combine radio telemetry data with an arbitrary set of user-defined, spatial features. We describe an efficient stochastic gradient algorithm for fitting model parameters to data and demonstrate its effectiveness via asymptotic analysis and synthetic experiments. We also apply our model to real datasets, and show that it outperforms the most popular radio telemetry software package used in ecology. We conclude that integration of different data sources under a single statistical framework, coupled with appropriate parameter and state estimation procedures, produces both accurate location estimates and an interpretable statistical model of animal movement.
Agent-Based Modeling of the Evolution of Vowel Harmony
K. David Harrison, Mark Dras, Berk Kapicioglu
North East Linguistic Society (NELS), 2002.
Paper
Ph.D. thesis
Applications of Machine Learning to Location Data
Berk Kapicioglu
Princeton University, Department of Computer Science, 2013.
PDF
Miscellaneous
Mobile Phone-Derived Features for Stop Detection
Daniel Kronovet, Berk Kapicioglu, Lauren Hannah
Columbia University Data Science Institute, Poster Competition, Finalist, 2016.
Poster
Right time, right place: A collaborative approach for accurate context-awareness in mobile apps and ads
Berk Kapicioglu, David S. Rosenberg, Robert E. Schapire, Tony Jebara
Columbia University Newsletter, 2015.
Website
Place Models for Sparse Location Prediction
Berk Kapicioglu, David S. Rosenberg, Robert E. Schapire, Tony Jebara
New York Academy of Sciences (NYAS), Machine Learning Symposium, 2012.
Place Recommendation with Implicit Spatial Feedback
Berk Kapicioglu, David S. Rosenberg, Robert E. Schapire, Tony Jebara
New York Academy of Sciences (NYAS), Machine Learning Symposium, 2011.
Learning Animal Movement Models and Location Estimates Using HMMs
Berk Kapicioglu, Robert E. Schapire, Martin Wikelski, Tamara Broderick
Neural Information Processing Systems (NeurIPS), Stochastic Models of Behaviour Workshop, 2008.
Learning Animal Movement Models and Location Estimates Using HMMs
Berk Kapicioglu, Robert E. Schapire, Martin Wikelski, Tamara Broderick
New York Academy of Sciences (NYAS), Machine Learning Symposium, 2008.
Patents
Passive Visit Detection
Stephanie Yang, Daniel Kronovet, Lauren Hannah, Berk Kapicioglu
Publication Number: WO2018053330 A1.
Website
Venue Detection
Berk Kapicioglu, Enrique Cruz, Aaron Mitchell, Stephanie Yang
Publication Number: US20180080793 A1.
Website
Venue Identification from Wireless Scan Data
Berk Kapicioglu, Blake Shaw
Publication Number: US20160295372 A1.
Website
Venue Prediction Based on Ranking
Berk Kapicioglu, David Rosenberg
Publication Number: US20130325855 A1.
Website
Method for Analyzing and Ranking Venues
Berk Kapicioglu, David Rosenberg
Publication Number: US20130325746 A1.
Website
Collaborators
- Robert E. Schapire (Princeton University, Microsoft Research)
- David S. Rosenberg (Bloomberg, NYU Center for Data Science)
- Tony Jebara (Columbia University, Spotify)
- Tamara Broderick (MIT)