Seminar:
TALK: How users evaluate things and each other in social media
Speaker: Jure Leskovec
Speaker Affiliation: Stanford University
Host: Sham Kakade
Host Affiliation: Microsoft Research New England
Date: Wednesday, September 5, 2012.
Time: 4:00 PM - 5:00 PM
Location:
Microsoft Research New England.
First Floor Conference Center.
One Memorial Drive, Cambridge, MA.
In a variety of domains, mechanisms for evaluation allow one user to
say whether he or she trusts another user, or likes the content they
produced, or wants to confer special levels of authority or
responsibility on them. We investigate a number of fundamental ways in
which user and item characteristics affect evaluations in online settings. For
example, evaluations are not unidimensional but include multiple aspects that
together contribute to a user's overall rating. We investigate methods for
modeling attitudes and attributes from online reviews that help us better
understand users' individual preferences. We also examine how to create a
composite description of evaluations that accurately reflects some type of
cumulative opinion of a community. Natural applications of these investigations
include predicting evaluation outcomes based on user characteristics, and
estimating the chance of a favorable overall evaluation from a group knowing
only the attributes of the group's members, but not their expressed opinions.
Jure Leskovec is an assistant professor of Computer Science at Stanford University, where he is a member of the Info Lab and the AI Lab.
His research focuses on mining large social and information networks.
The problems he investigates are motivated by large-scale data, the Web, and on-line media.
This research has won several awards, including best paper awards at KDD (2005, 2007, 2010), WSDM (2011), ICDM (2011), and the
ASCE J. of Water Resources Planning and Management (2009), the ACM KDD dissertation award (2009), a Microsoft Research Faculty Fellowship (2011),
an Alfred P. Sloan Fellowship (2012), and an NSF Early Career Development (CAREER) Award (2011).
He received his bachelor's degree in computer science from the University of Ljubljana, Slovenia,
his Ph.D. in machine learning from Carnegie Mellon University, and postdoctoral training at Cornell University.
You can follow him on Twitter at @jure.
Upcoming Seminar:
TALK: Theoretical and Algorithmic Foundations of Online Learning
Speaker: Sasha Rakhlin
Speaker Affiliation: University of Pennsylvania
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT
Date: Wednesday, October 17th, 2012.
Time: 4:00 PM - 5:00 PM
Location:
McGovern Seminar Room, MIT 46-3189
Within the framework of sequential prediction (online learning), data arrive in a stream, and the learner is tasked with making a sequence of decisions. Such a basic scenario has been studied in Information Theory, Decision Theory, Game Theory, Statistics, and Machine Learning. The learning protocol and the non-i.i.d. (or even adversarial) nature of observed data constitute a big departure from the well-studied setting of Statistical Learning Theory. In the latter, many important tools and complexity notions of the hypothesis class have been developed, starting with the pioneering work of Vapnik and Chervonenkis. In contrast, the theoretical understanding of online learning has been lacking, as most results are obtained on a case-by-case basis.
In this talk, we first focus on no-regret online learning and develop the relevant notions of complexity in a surprising parallel to Statistical Learning Theory. We characterize online learnability through finiteness of the sequential versions of combinatorial dimensions, random averages, and covering numbers. This non-constructive study of inherent complexity is then augmented with a recipe for developing online learning algorithms via a notion of a relaxation. To demonstrate the utility of our approach, we develop a new family of randomized methods and new algorithms for the matrix completion problem. We then discuss extensions of our techniques beyond no-regret learning, including Blackwell approachability and calibration of forecasters. Finally, we present open problems and directions of further research.
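For readers new to the setting, a minimal sketch may help fix ideas. The following is the textbook exponential weights (Hedge) algorithm, a classic no-regret method for prediction with expert advice; it is only an illustrative special case, not the relaxation-based algorithms developed in the talk, and all parameters and data below are assumptions.

```python
import numpy as np

def hedge(loss_matrix, eta):
    """loss_matrix[t, i] is the loss of expert i at round t, assumed in [0, 1]."""
    T, n = loss_matrix.shape
    log_w = np.zeros(n)                    # log-weights, for numerical stability
    total_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                       # distribution over experts played this round
        total_loss += p @ loss_matrix[t]   # expected loss of the randomized learner
        log_w -= eta * loss_matrix[t]      # multiplicative (exponential) update
    # Regret against the single best expert in hindsight; Hedge guarantees
    # O(sqrt(T log n)) regret for eta ~ sqrt(log(n) / T).
    return total_loss - loss_matrix.sum(axis=0).min()

rng = np.random.default_rng(0)
losses = rng.uniform(size=(1000, 10))      # 1000 rounds, 10 experts
print(hedge(losses, eta=np.sqrt(np.log(10) / 1000)))
```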
Sasha Rakhlin is an Assistant Professor in the Department of Statistics
at the Wharton School, University of Pennsylvania.
Colloquium:
TALK: From Rosenblatt's learning model to
the model of learning with nontrivial teacher.
Speaker: Vladimir Vapnik
Speaker Affiliation: Royal Holloway, University of London
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning,
MIT-IIT
Date: September 26th, 2012
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 46-3002 Singleton Auditorium
Vladimir Naumovich Vapnik is one of the main developers of the
Vapnik-Chervonenkis theory. He received his master's degree in mathematics from
the Uzbek State University, Samarkand, Uzbek SSR, in 1958 and his Ph.D. in
statistics from the Institute of Control Sciences, Moscow, in 1964. He worked
at this institute from 1961 to 1990 and became Head of the Computer Science
Research Department. At the end of 1990, he moved to the USA and joined the
Adaptive Systems Research Department at AT&T Bell Labs in Holmdel, New Jersey.
The group later became the Image Processing Research Department of AT&T
Laboratories when AT&T spun off Lucent Technologies in 1996. Vapnik left
AT&T in 2002 and joined NEC Laboratories in Princeton, New Jersey, where he
currently works in the Machine Learning group. He has also held a position as
Professor of Computer Science and Statistics at Royal Holloway, University of
London, since 1995, and as Professor of Computer Science at Columbia
University, New York City, since 2003. He was inducted into the U.S.
National Academy of Engineering in 2006. He received the 2005 Gabor Award, the
2008 Paris Kanellakis Award, the 2010 Neural Networks Pioneer Award, the 2012
IEEE Frank Rosenblatt Award, and the 2012 Benjamin Franklin Medal in Computer
and Cognitive Science.
Upcoming Colloquium:
TALK: Some Mathematics of Immunology and of Protein Folding.
Speaker: Steve Smale
Speaker Affiliation: City University of Hong Kong
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning,
MIT-IIT
Date: November 28th, 2012
Time: 4:00 PM - 5:30 PM
Location: MIT Bldg 46-3002 Singleton Auditorium
A geometrical picture of certain current activities in biology will be presented.
In particular, proposals for "good kernels" will be suggested for sets of amino acid strings as well as sequences of Ramachandran angle pairs.
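As a rough illustration of what a kernel on amino acid strings can look like, here is the classic k-mer spectrum kernel (the inner product of substring-count vectors). This is a standard construction offered only for orientation; it is not necessarily among the "good kernels" proposed in the talk, and the example strings are made up.

```python
from collections import Counter

def spectrum_kernel(s, t, k=3):
    """Inner product of k-mer count vectors: a positive definite string kernel."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(c * ct[m] for m, c in cs.items())

# Two toy amino acid strings differing in their last residue share 7 of 8 3-mers.
print(spectrum_kernel("MKTAYIAKQR", "MKTAYIAKQL"))  # -> 7
```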
Stephen Smale is an American mathematician from Flint, Michigan.
He was awarded the Fields Medal in 1966, and spent more than three decades on the mathematics faculty of the University of California, Berkeley (1960-1961 and 1964-1995).
Since 2002, Smale has been a Professor at the Toyota Technological Institute at Chicago; since August 1, 2009, he has also been a Distinguished University Professor at the City University of Hong Kong.
In 2007, Smale was awarded the Wolf Prize in mathematics.
Seminar:
TALK: Deep Architectures and Deep Learning: Theory, Algorithms, and Applications.
Speaker: Pierre Baldi
Speaker Affiliation: University of California, Irvine
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT
Date: December 13th, 2012
Time: 3:30 PM - 4:30 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/Kiva)
Deep architectures are important for machine learning, for engineering applications, and for
understanding the brain. In this talk, we will provide a brief historical overview of deep architectures
from their 1950s origins to today. Motivated by this overview, we will study and prove several theorems regarding
deep architectures and one of their main ingredients--autoencoder circuits--in particular in the unrestricted Boolean
and unrestricted probabilistic cases. We will show how these analyses lead to a new family of learning algorithms
for deep architectures--the deep target (DT) algorithms. The DT approach converts the problem of learning
a deep architecture into the problem of learning many shallow architectures by providing learning targets
for each deep layer. Finally, we will present simulation results and applications of deep architectures
and DT algorithms to the protein structure prediction problem.
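The DT algorithms themselves are presented in the talk; as a loose sketch of the general idea of decomposing deep training into a sequence of shallow learning problems, here is greedy layer-wise autoencoder pretraining in plain numpy. The architecture sizes, learning rate, and data are illustrative assumptions, and this is not the DT algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, hidden, steps=500, lr=0.1):
    """One shallow problem: reconstruct X through a single tied-weight hidden layer."""
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, hidden))
    for _ in range(steps):
        H = np.tanh(X @ W)                      # encode
        err = H @ W.T - X                       # reconstruction error (decode with W^T)
        dpre = (err @ W) * (1 - H ** 2)         # backprop through tanh
        W -= lr * (X.T @ dpre + err.T @ H) / n  # gradient w.r.t. the tied weights
    return W

# Stack shallow solutions: each layer is trained on the previous layer's codes.
X = rng.normal(size=(200, 20))
codes, weights = X, []
for hidden in (16, 8, 4):
    W = train_autoencoder(codes, hidden)
    weights.append(W)
    codes = np.tanh(codes @ W)
print([W.shape for W in weights])               # [(20, 16), (16, 8), (8, 4)]
```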
Pierre Baldi is Chancellor's Professor in the Department of Computer Science and Director of the Institute for Genomics and
Bioinformatics and Associate Director of the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. He received his Ph.D. from the California Institute of Technology. His research work is at the interface of the computational and life sciences, in particular the application of artificial intelligence and statistical machine learning methods to problems in chemoinformatics, genomics, systems biology, and computational neuroscience. He is credited with pioneering the use of Hidden Markov Models (HMMs), graphical models, and recursive neural networks in bioinformatics. His group has developed widely used databases, software, and web servers including the ChemDB database and chemoinformatics portal for the prediction of molecular properties and applications in chemical synthesis and drug discovery, the SCRATCH suite for protein structure prediction, the Cyber-T program for the differential analysis of gene expression data using Bayesian statistical methods, the MotifMap system for charting transcription factor binding sites on a genome-wide scale, and the CRICK expert system for analyzing molecular networks and pathways in healthy and diseased systems.
Dr. Baldi has published over 260 peer-reviewed research articles and four books. He is the recipient of the 1993 Lew Allen Award, the 1999 Laurel Wilkening Faculty Innovation Award, a 2006 Microsoft Research Award, and the 2010 E. R. Caianiello Prize for research in machine learning. He is also a Fellow of the American Association for the Advancement of Science (AAAS), the Association for the Advancement of Artificial Intelligence (AAAI), and the Institute of Electrical and Electronics Engineers (IEEE). Through his consulting company IDLAB, Inc., he has consulted for government agencies, publishers, and several companies in the information technology and biotechnology industries.
He was co-founder and CEO of Net-ID, Inc. in the 1990s, a company focused on the application of machine learning methods to fingerprint
recognition and bioinformatics. More recently, he co-founded Reaction Explorer, a company developing interactive expert
systems for chemical education, and Group IV Biosystems, a synthetic biology company.
Seminar:
TALK: Transportation Distances and their Application in Machine Learning: New Problems
Speaker: Marco Cuturi
Speaker Affiliation: Kyoto University
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT
Date: December 12th, 2012
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/Kiva)
Marco Cuturi received his Ph.D. in 2005 from the Ecole des Mines de Paris, under the supervision of Jean-Philippe Vert. He has worked at the
Institute of Statistical Mathematics in Tokyo and at Princeton University, and is now an Associate Professor at Kyoto University.
I will present in this talk two new research topics related to the optimal
transportation distance (also known as the Earth Mover's or Wasserstein
distance) and its application in machine learning to compare histograms of
features.

I will first discuss the ground metric learning problem, namely the problem of
automatically tuning the parameters of transportation distances using labeled
histogram data. After providing some reminders on optimal transportation, I
will argue that learning transportation distances is akin to learning an L1
distance on the simplex, namely a distance with polyhedral level sets, and I
will draw some parallels with Mahalanobis distances, the L2 distance, and
elliptic level sets. I will then introduce our algorithm (arXiv:1110.2306)
and more recent extensions.

In the second part of my talk, I address the fact that transportation
distances are not Hilbertian by showing that they can be cast as positive
definite kernels through the "generating function trick". We prove that the
trick, which uses the generating function of the transportation polytope to
define a similarity, rather than focusing exclusively on the optimal transport
to define a distance, leads to a positive definite kernel between histograms
(arXiv:1209.2655).
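As background, a minimal sketch of the transportation distance itself may be useful: given two histograms and a ground metric M, the distance is the value of a linear program over the transportation polytope. The ground metric below (absolute difference of bin indices) and all data are illustrative assumptions; learning M from labeled data is precisely the ground metric learning problem discussed above.

```python
import numpy as np
from scipy.optimize import linprog

def transport_distance(p, q, M):
    """min_T <T, M>  s.t.  T @ 1 = p,  T.T @ 1 = q,  T >= 0."""
    n, m = len(p), len(q)
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):                 # row marginals: sum_j T[i, j] = p[i]
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):                 # column marginals: sum_i T[i, j] = q[j]
        A_eq[n + j, j::m] = 1.0
    res = linprog(M.ravel(), A_eq=A_eq, b_eq=np.concatenate([p, q]),
                  bounds=(0, None))
    return res.fun

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
M = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
print(transport_distance(p, q, M))     # every unit of mass shifts one bin: 1.0
```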
Seminar:
TALK: Regularized Learning in Reproducing Kernel Banach Spaces.
Speaker: Jun Zhang
Speaker Affiliation: Department of Psychology, University of Michigan, Ann Arbor
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT
Date: Wed. April 17th, 2013
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/Kiva)
Regularized learning is the contemporary framework for learning to generalize
from finite samples (classification, regression, clustering, etc.). Here the
problem is to learn an input-output mapping f: X -> Y given finite samples
{(x_i, y_i), i=1,...,N}. With minimal structural assumptions on X, the class
of functions under consideration is assumed to fall within a Banach space of
functions B. The learning-from-data problem is then formulated as an
optimization problem in this function space, with the desired mapping as the
optimizer to be sought, where the objective function consists of a loss term
L(f) capturing its goodness-of-fit (or the lack thereof) on the given samples
{(f(x_i), y_i), i=1,...,N}, and a penalty term R(f) capturing its complexity
based on prior knowledge about the solution (smoothness, sparsity, etc.). This
second, regularizing term is often taken to be the norm of B, or a
transformation Phi thereof: R(f) = Phi(||f||). This program has been
successfully carried out for Hilbert spaces of functions, resulting in the
celebrated Reproducing Kernel Hilbert Space methods in machine learning. Here,
we will remove the Hilbert space restriction, i.e., the existence of an inner
product, and show that the key ingredients of this framework (reproducing
kernel, representer theorem, feature space) continue to hold for a Banach
space that is uniformly convex and uniformly Frechet differentiable. Central
to our development is the use of a semi-inner-product operator and duality
mapping for a uniform Banach space in place of the inner product for a Hilbert
space. This opens up the possibility of unifying kernel-based methods
(regularizing the L2-norm) and sparsity-based methods (regularizing the
l1-norm), which have so far been investigated under different theoretical
foundations.
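The Hilbert-space instance of this program is worth seeing concretely: in an RKHS with squared loss and R(f) = ||f||^2, the representer theorem reduces the infinite-dimensional problem to a finite linear system (kernel ridge regression). A minimal sketch follows; the kernel choice, data, and parameters are illustrative assumptions, not anything from the talk.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=0.1):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_krr(X, y, lam=1e-3, sigma=0.1):
    """Representer theorem: f(x) = sum_i alpha_i k(x_i, x); solve (K + lam*I) alpha = y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

X = np.linspace(0, 1, 20)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
alpha = fit_krr(X, y)
X_new = np.array([[0.25]])
print(gaussian_kernel(X_new, X) @ alpha)  # roughly sin(pi/2) = 1
```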
Dr. Jun Zhang is a Professor of Psychology at the University of Michigan, Ann
Arbor. He received the B.Sc. degree in Theoretical Physics from Fudan
University in 1985, and the Ph.D. degree in Neurobiology from the University
of California, Berkeley in 1992. He has also held visiting positions at the
University of Melbourne, the University of Waterloo, and the RIKEN Brain
Science Institute. During 2007-2010, he worked as the Program Manager for the
U.S. Air Force Office of Scientific Research (AFOSR), in charge of the basic
research portfolio for Cognition and Decision in the Directorate of
Mathematics, Information, and Life Sciences. Dr. Zhang served as the President
of the Society for Mathematical Psychology (SMP) and serves on the Federation
of Associations in Behavioral and Brain Sciences (FABBS). He is an Associate
Editor for the Journal of Mathematical Psychology, and a Fellow of the
Association for Psychological Science (APS). Dr. Zhang has published about 50
peer-reviewed journal papers in the fields of vision, mathematical psychology,
cognitive psychology, cognitive neuroscience, game theory, and machine
learning. His research has been funded by the National Science Foundation
(NSF), the Air Force Office of Scientific Research (AFOSR), and the Army
Research Office (ARO).
Seminar:
TALK: Causal Inference and Anticausal Learning.
Speaker: Bernhard Schölkopf
Speaker Affiliation: Max Planck Institute for Intelligent Systems, Tübingen, Germany
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT
Date: Fri. June 7th, 2013
Time: Starting 3:30 pm
Location: MIT Bldg 46 Singleton Auditorium
Causal inference is an intriguing field examining causal structures by testing their statistical footprints. The talk introduces the main ideas of causal inference from the point of view of machine learning, and discusses implications of underlying causal structures for popular machine learning scenarios such as covariate shift and semi-supervised learning. It argues that causal knowledge may facilitate some approaches for a given problem, and rule out others.
Organizers
The seminar series is organized by:
The colloquium series is coordinated by
Cynthia Rudin.