Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India

UPCOMING SEMINARS

Series: Ph.D. Thesis Defense
Title: Non-Parametric Clustering of Multivariate Count Data

  • Speaker: Ms. Lavanya Sita Tekumalla
  • Faculty Advisor: Prof. Chiranjib Bhattacharyya
  • Date and Time: Monday, March 13, 2017, 9:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
While there has been significant work in Bayesian non-parametric modeling in the last decade, there has been much less work on non-parametric clustering of multivariate count data. The first contribution of this thesis addresses extensions to clustering with Poisson-based models for multivariate counts. While the Poisson is the most popular distribution for count modeling, the multivariate Poisson often leads to intractable inference and a sub-optimal fit of the data. To address this, we introduce a family of models based on the Sparse Multivariate Poisson that exploit the inherent sparsity in multivariate data, leading to a better fit and more efficient inference. We explore Dirichlet process mixture model extensions and temporal non-parametric extensions to models based on the Sparse Multivariate Poisson, for practical use of Poisson-based models in non-parametric clustering of multivariate counts.

A second contribution of this thesis moves beyond the limitations of Poisson-based models for non-parametric clustering, for instance in handling overdispersed data or data with negative correlations. We explore, for the first time, marginal independent inference techniques based on the Gaussian copula for multivariate count data in the Dirichlet process mixture model setting. This enables non-parametric clustering of multivariate counts without the limiting assumptions that usually restrict the marginals to a particular family. This inference technique also works for mixed data (a combination of count, binary and continuous data), enabling Bayesian non-parametric modeling to be used for a wide variety of data types.

The third contribution of this thesis addresses modeling a wider range of complex dependencies, such as asymmetric and tail dependencies, with vine copula based Dirichlet process mixtures. While vine copula inference has been well explored for continuous data, it is still a topic of active research for multivariate counts and mixed multivariate data. An efficient marginal independent inference approach based on the extended rank likelihood is proposed in this thesis, extending the use of vines to multivariate counts and mixed data in practical clustering scenarios.
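
To make the Dirichlet process mixture machinery concrete, here is a minimal collapsed (Chinese Restaurant Process) Gibbs sampler for clustering count vectors, assuming independent Gamma-Poisson likelihoods per dimension. The Sparse Multivariate Poisson models in the thesis are richer; the priors and function names below are illustrative assumptions only.

```python
import numpy as np
from scipy.special import gammaln

def log_predictive(x, sum_x, n, a=1.0, b=1.0):
    # Negative-Binomial predictive of count vector x for a cluster that
    # already holds n points with per-dimension count sums sum_x, under
    # independent Poisson likelihoods with Gamma(a, b) priors.
    a_post, b_post = a + sum_x, b + n
    return np.sum(gammaln(a_post + x) - gammaln(a_post) - gammaln(x + 1)
                  + a_post * np.log(b_post / (b_post + 1.0))
                  - x * np.log(b_post + 1.0))

def crp_gibbs(X, alpha=1.0, iters=50):
    n, d = X.shape
    z = np.zeros(n, dtype=int)                      # cluster label per row
    stats = {0: (X.sum(axis=0).astype(float), n)}   # label -> (sums, size)
    for _ in range(iters):
        for i in range(n):
            s, c = stats[z[i]]                      # remove point i
            s, c = s - X[i], c - 1
            if c == 0:
                del stats[z[i]]
            else:
                stats[z[i]] = (s, c)
            labels = list(stats) + [max(stats, default=-1) + 1]
            logp = np.array(
                [np.log(stats[k][1]) + log_predictive(X[i], *stats[k])
                 for k in stats]
                + [np.log(alpha) + log_predictive(X[i], np.zeros(d), 0)])
            p = np.exp(logp - logp.max())
            z[i] = labels[np.random.choice(len(labels), p=p / p.sum())]
            s0, c0 = stats.get(z[i], (np.zeros(d), 0))
            stats[z[i]] = (s0 + X[i], c0 + 1)
    return z
```

For example, `crp_gibbs(np.random.poisson(3.0, size=(100, 5)))` returns one label per row, with the number of clusters inferred rather than fixed in advance.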

This thesis also explores a novel systems application, Bulk Cache Preloading, by analyzing I/O traces through predictive models for temporal non-parametric clustering of multivariate count data. State-of-the-art techniques in the caching domain are limited to exploiting short-range correlations in memory accesses at millisecond or smaller granularity. We explore, for the first time, Bulk Cache Preloading: the process of proactively predicting data to load into the cache minutes or hours before the actual request from the application, by leveraging longer-range correlations at the granularity of minutes or hours. Our approach involves data aggregation, converting I/O traces into a temporal sequence of multivariate counts that we analyze with the temporal non-parametric clustering models.
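
As a sketch of the data aggregation step described above (the trace format and block-grouping scheme are assumptions of this illustration, not the thesis's actual pipeline), the following converts a raw I/O trace into one count vector per time window:

```python
import numpy as np
import pandas as pd

def aggregate_trace(trace, window=60, n_groups=16):
    # trace: iterable of (timestamp_seconds, block_id) access records.
    # Bucket accesses into fixed windows and count hits per block group,
    # yielding one multivariate count vector per window.
    df = pd.DataFrame(trace, columns=["t", "block"])
    df["win"] = (df["t"] // window).astype(int)
    df["grp"] = df["block"] % n_groups        # toy grouping of blocks
    counts = df.pivot_table(index="win", columns="grp", values="block",
                            aggfunc="count", fill_value=0)
    return counts.reindex(columns=range(n_groups), fill_value=0).to_numpy()
```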

As an additional contribution, this thesis addresses multi-level non-parametric admixture modeling for discrete data in the form of grouped categorical data. Non-parametric topic modeling for document collections is well explored with admixture models such as the Hierarchical Dirichlet Process. However, there exist scenarios where a document must be associated with themes at multiple levels, where each theme is itself an admixture over themes at the previous level, motivating the need for multi-level admixtures. We propose the nested Hierarchical Dirichlet Process to address this problem, and apply a two-level version of our model for non-parametric entity-topic modeling, automatically learning entities (in the form of authors) and topics from research corpora.


 

Series: Department Seminar
Title: Intelligent Inclusive Interaction Design

  • Speaker: Pradipta Biswas
                   CPDM, IISc
  • Date and Time: Friday, March 03, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The Intelligent Inclusive Interaction Design (I3D) Lab at CPDM undertakes research in the fields of human-computer interaction, intelligent user interfaces and inclusive design. Our present research investigates new modalities of interaction, in particular eye-gaze-controlled interfaces. We are developing new target prediction and multimodal fusion algorithms to improve the accuracy and response times of gaze-controlled interfaces, facilitating human-machine interaction in automotive and military aviation environments, and we are working on detecting users' cognitive load by analysing their ocular parameters. We also actively pursue research on assistive technology for spastic children and on body posture tracking for developing smart manufacturing systems. In particular, this talk will present the underlying theories and video demonstrations of our work on Inclusive User Modelling and Multimodal Gaze Controlled systems.

Speaker Bio:
Pradipta Biswas is an Assistant Professor at the Centre for Product Design and Manufacturing of the Indian Institute of Science. His research focuses on user modelling and multimodal human-machine interaction for aviation and automotive environments and for assistive technology. He set up and leads the Intelligent Inclusive Interaction Design (I3D) Lab at CPDM, IISc. Earlier, he was a Senior Research Associate at the Engineering Department, a Research Fellow at Wolfson College, and a Research Associate at Trinity Hall of the University of Cambridge. He completed his PhD in Computer Science at the Rainbow Group of the University of Cambridge Computer Laboratory and Trinity College in 2010, and was awarded a Gates-Cambridge Scholarship in 2006. He also conducted a course on Human Computer Interaction at the Indian Institute of Technology, Mandi, guest lectured at the Indian Institute of Technology, Madras, and was a vice chairman of the ITU-T Focus Group on Smart TV. Besides academics, he is a keen rifle shooter and won gold medals at the Imperial Shooting Meet from 2010 to 2015.

Host Faculty: Prof. Vijay Natarajan


 

PAST SEMINARS

Series: M.Sc. (Engg) Colloquium
Title: Efficient Whole Program Path Tracing

  • Speaker: Sridhar Gopinath
  • Faculty Advisor: Dr. Murali Krishna Ramanathan
  • Date and Time: Thursday, February 16, 2017, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
A whole program path (WPP) captures the runtime behavior of a program by recording the control-flow trace of its execution. Obtaining WPPs has multiple benefits, including identification of hot paths, code coverage analysis, and bug detection and reproduction. Existing techniques to compute WPPs are expensive: sub-optimal placement of instrumentation causes the execution to incur significant space and time overheads. This necessitates the design of an efficient instrumentation strategy for obtaining WPPs. More specifically, the goal is to insert the minimum instrumentation into the program that generates partial traces from which any path can be reconstructed precisely.

In this thesis, we design a novel and scalable whole program analysis to achieve the aforementioned goal. Our approach is inspired by a simple observation that for any pair of paths in the program, instrumenting one edge is sufficient to distinguish them. Performing this for all pairs of paths and identifying the minimum number of edges to instrument reduces to the minimum hitting set problem. Because collecting all pairs of paths is practically infeasible, we propose efficient strategies that avoid collecting the paths explicitly. We employ approximations to overcome the NP-hardness of identifying the minimum instrumentation points. Our techniques are designed to ensure that these approximations do not affect the precise reconstruction of the paths.
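
As a sketch of the underlying objective (using the naive explicit enumeration of path pairs that the thesis expressly avoids), the reduction to minimum hitting set and its standard greedy approximation might look like this, where each path is represented by its set of CFG edge ids:

```python
from itertools import combinations

def greedy_instrumentation(paths):
    # paths: list of frozensets of CFG edge ids, one per path.
    # Instrumenting any edge in the symmetric difference of two paths
    # distinguishes them, so hitting every pairwise difference set
    # distinguishes all pairs of paths.
    diffs = [p ^ q for p, q in combinations(paths, 2) if p != q]
    chosen = set()
    while diffs:
        candidates = {e for d in diffs for e in d}
        # Greedy: pick the edge hitting the most undistinguished pairs.
        best = max(candidates, key=lambda e: sum(e in d for d in diffs))
        chosen.add(best)
        diffs = [d for d in diffs if best not in d]
    return chosen
```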

We have implemented our approach and performed an elaborate evaluation on Java programs. Our experimental results on the DaCapo benchmark suite demonstrate the efficacy of our approach across multiple dimensions. On average, our approach instruments 9.5% of the overall number of CFG edges in the program. The average runtime overhead to generate WPPs using our approach is 1.97x. More importantly, compared to the state of the art in WPP computation, we observe average and maximum reductions in runtime overheads by factors of 2.8 and 5.4 respectively. Further, our evaluation shows that our implementation is able to reconstruct WPPs precisely from the obtained traces.

Speaker Bio:
Sridhar Gopinath has been an M.Sc. (Engg) student in the Department of Computer Science and Automation since August 2015. His research interests are in software engineering and programming languages. He is a member of the Scalable Software Systems lab in CSA. He received a B.E. in Computer Science from SJCE, Mysore in 2015.


 

Series: Department Seminar
Title: Progress in Error-Correction: A Survey

  • Speaker: Prof. Venkatesan Guruswami
                   CMU, USA
  • Date and Time: Friday, February 10, 2017, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Error-correcting codes play a crucial role in safeguarding data against the adverse effects of noise during communication and storage. They are also powerful tools that underlie several recent advances in theoretical computer science and combinatorics. The central challenge in coding theory is to construct codes with the minimum possible redundancy for different error models and requirements on the decoder, along with efficient algorithms for error-correction using those codes. Much progress has been made toward this quest in the nearly seven decades since the birth of coding theory. Several fundamental problems, however, are still outstanding, and exciting new directions continue to emerge to address current technological demands as well as applications in computational complexity and cryptography. This talk will survey some of our recent work on error-correction in various models, such as:

- worst-case errors, where we construct list-decodable codes with redundancy as small as the target error fraction;
- i.i.d. errors, where we show polar codes enable efficient error-correction even as the redundancy approaches Shannon capacity;
- bit deletions, where we give codes that can correct the largest known fraction of deletions;
- single symbol erasure, a model of renewed importance for tackling node failures in distributed storage, where we give novel repair algorithms for Reed-Solomon codes as well as simple new codes with low-bandwidth repair mechanisms.

Host Faculty: Dr. Anand Louis


 

Series: Department Seminar
Title: Algorithmic and optimization aspects of Brascamp-Lieb inequalities

  • Speaker: Dr. Ankit Garg, Microsoft Research New England
  • Date and Time: Friday, February 10, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The celebrated Brascamp-Lieb (BL) inequalities (and their extensions) are an important mathematical tool, unifying and generalizing numerous inequalities in analysis, convex geometry and information theory. While their structural theory is very well understood, far less is known about computing their main parameters. We give polynomial time algorithms to compute:
- feasibility of a BL-datum;
- the optimal BL-constant;
- a weak separation oracle for the BL-polytope.

The algorithms are obtained by a simple, efficient reduction of a given BL-datum to an instance of the Operator Scaling problem defined by Gurvits, for which the present authors recently gave a polynomial time algorithm.
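
For intuition, here is a bare-bones version of the alternating normalization at the heart of operator scaling for a tuple of matrices A_1, ..., A_m; the concrete BL-to-operator-scaling reduction is not reproduced, and the convergence test below is a simplification assumed for illustration (generic input keeping the Gram matrices positive definite):

```python
import numpy as np

def operator_scale(As, iters=200, tol=1e-9):
    # Alternately normalize the tuple so that sum_i A_i A_i^T = I
    # (row step) and sum_i A_i^T A_i = I (column step).
    As = [A.astype(float).copy() for A in As]
    n = As[0].shape[0]
    err = np.inf
    for _ in range(iters):
        S = sum(A @ A.T for A in As)                 # T(I)
        L = np.linalg.inv(np.linalg.cholesky(S))
        As = [L @ A for A in As]                     # now T(I) = I
        S = sum(A.T @ A for A in As)                 # T*(I)
        R = np.linalg.inv(np.linalg.cholesky(S)).T
        As = [A @ R for A in As]                     # now T*(I) = I
        err = np.linalg.norm(sum(A @ A.T for A in As) - np.eye(n))
        if err < tol:
            break
    return As, err
```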

If time permits, I will describe the common problem unifying BL inequalities, operator scaling and many similar problems: testing membership in the Null-Cone (the common zero set of all invariant polynomials) of a group action on a vector space. A polynomial time algorithm for the Null-Cone problem remains an interesting open problem.

This will be based on a joint work with Leonid Gurvits, Rafael Oliveira and Avi Wigderson and another ongoing work with Peter Burgisser, Rafael Oliveira, Michael Walter and Avi Wigderson.

Speaker Bio:
Ankit Garg is a postdoctoral researcher at Microsoft Research New England. Before that, he received his PhD in Computer Science from Princeton University under the guidance of Prof. Mark Braverman, and his undergraduate degree from IIT Delhi. His research interests include communication complexity, information complexity, algebraic complexity, and more broadly, computational complexity and theoretical computer science.

Host Faculty: Dr. Chandan Saha


 

Series: Department Seminar
Title: Parameterized Complexity of Biclique cover and Partition

  • Speaker: Davis Issac
                   Max Planck Institute for Informatik
                   Saarbruecken
                   Germany
  • Date and Time: Wednesday, February 01, 2017, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
I will talk about my recent paper, "On the parameterized complexity of biclique cover and partition", co-authored with Sunil Chandran and Andreas Karrenbauer and published at IPEC 2016. The problems I will talk about are biclique cover and biclique partition. In biclique cover, we are given a bipartite graph and an integer $k$, and we want to decide whether we can cover all the edges of the graph with at most $k$ complete bipartite graphs. In biclique partition, the only difference is that we want to partition the edges instead of covering them. We give an algorithm with $O(2^{k^2}) poly(n)$ running time for biclique partition, whereas the best previously known fixed-parameter running time for the problem was $O(2^{2^k}) poly(n)$. Our algorithm makes use of a linear algebraic formulation of the problem and uses linear algebraic techniques. For biclique cover, we show that it is not possible to get running times better than doubly exponential in $k$, assuming the Exponential Time Hypothesis. We also show that there is no sub-exponential kernel in terms of $k$ for biclique cover unless $P = NP$. We achieve these two results by giving a reduction from 3SAT on $n$ variables to an instance of biclique cover with $k = O(log n)$.
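
For concreteness, verifying a proposed solution to either problem is straightforward; a minimal checker is sketched below (the vertex and edge encodings are assumptions of this illustration):

```python
from itertools import product

def check_biclique_cover(edges, bicliques, partition=False):
    # edges: iterable of (u, v) pairs of a bipartite graph.
    # bicliques: list of (S, T) pairs of vertex sets; each contributes
    # all edges in S x T. Verifies a cover, or an exact partition.
    covered = [e for S, T in bicliques for e in product(S, T)]
    if partition and len(covered) != len(set(covered)):
        return False  # some edge is covered more than once
    return set(covered) == set(edges)
```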

Host Faculty: Prof. Sunil Chandran


 

Series: M.Sc. (Engg) Colloquium
Title: A Referral Reward Embedded, Biphase Information Diffusion Technique

  • Speaker: Ms. Sneha Mondal
                   M.Sc. (Engg) Student, CSA, IISc
  • Faculty Advisor: Prof. Y. Narahari
  • Date and Time: Monday, January 30, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The problem of maximizing the spread of influence over a social network is of interest to a variety of applications including marketing of a new product, popularizing a new technology or discovery, or designing a social campaign.

A social network is typically modeled as a graph, with nodes denoting agents and edges denoting relationships among agents. Influence propagates along the edges of the graph. Given a social network graph, an initial budget K, and an influence diffusion model, the influence-maximization problem is to determine K seed nodes such that, under the given diffusion model, the number of agents eventually influenced by the seeds is maximized. Most approaches to this problem in the existing literature expend the entire available budget in triggering diffusion at seed nodes, which serve as early adopters of a new product or technology and encourage its adoption via a word-of-mouth campaign.

The common underlying assumption in all conventional batch-seeding models is that agents have a standard Bayesian utility model, and that an active agent will influence its neighbours with a probability dictated by a stochastic process. Motivated by real-world considerations, we propose a significant departure from the above conventional method by explicitly considering the possibility of incentivizing active nodes by rewarding them for successful referrals as well. In particular, we investigate the effect of splitting the overall seed-budget across two sequential phases: In the first phase, we use a carefully computed fraction of the overall budget to trigger diffusion at a selected set of seed nodes. In the second phase, we use the remaining budget to offer referral incentives.

Adopting the well-known independent cascade diffusion model, we formulate an appropriate objective function for the above biphase diffusion problem and investigate its mathematical properties. We carry out experiments on synthetic as well as real-world datasets, and observe that under mild budget and temporal restrictions, an optimal split of the overall budget yields a decisive improvement in influence spread over the conventional single-phase method.
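
For reference, spread under the independent cascade model is typically estimated by Monte-Carlo simulation, as in the sketch below; the referral-incentive phase and the budget split are not modeled here, and the graph encoding is an assumption of this illustration.

```python
import random

def cascade_spread(graph, seeds, trials=1000):
    # graph: dict node -> list of (neighbour, influence_probability).
    # Returns a Monte-Carlo estimate of the expected number of
    # eventually-active nodes when `seeds` are activated at time 0.
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in graph.get(u, []):
                    if v not in active and random.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / trials
```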


 

Series: Ph.D. Thesis Defense
Title: Targeted Client Synthesis for Detecting Concurrency Bugs

  • Speaker: Ms. Malavika Samak
                   PhD student at CSA
  • Faculty Advisor: Dr. Murali Krishna Ramanathan
  • Date and Time: Monday, January 30, 2017, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Detecting concurrency bugs can be challenging due to the intricacies associated with their manifestation. These intricacies correspond to identifying the methods that need to be invoked concurrently, the inputs passed to these methods and the interleaving of the threads that cause the erroneous behavior. Neither fuzzing-based testing techniques nor over-approximate static analyses are well positioned to detect subtle concurrency defects while retaining high accuracy alongside satisfactory coverage. While dynamic analysis techniques have been proposed to overcome some of the challenges in detecting concurrency bugs, we observe that their success is critically dependent on the availability of effective multithreaded clients. Without a prior knowledge of the defects, manually constructing defect-revealing multithreaded clients is non-trivial.

In this talk, I will present our approach that addresses the problem of generating effective multithreaded clients. Our technique analyzes the execution traces obtained from executing random sequential clients and produces a concurrent client program that drives shared objects, via library method calls, to states conducive to triggering deadlocks, data races, atomicity violations or assertion violations. We have implemented prototypes that incorporate the aforementioned ideas and applied them to 29 well-tested concurrent classes from popular Java libraries, including the latest version of the JDK. We are able to automatically synthesize clients that helped expose more than 300 concurrency bugs. We reported many previously unknown bugs to the developers of these libraries, resulting in either fixes to the code or changes to the documentation pertaining to the thread-safe behavior of the relevant classes.

Speaker Bio:
Malavika Samak is a PhD student in the Department of Computer Science and Automation at the Indian Institute of Science, Bangalore since Aug 2012. Her research interests are broadly in the areas of programming languages and software engineering. She is a recipient of the Google India PhD Fellowship and has served on the Artifact Evaluation Committees for PLDI, OOPSLA, POPL and PPoPP-CGO. She graduated with a B.E degree from SJCE, Mysore, India.


 

Series: Department Seminar
Title: DeepFix: Fixing common C language errors by deep learning

  • Speaker: Mr. Rahul Gupta
                   Ph.D student
  • Faculty Advisor: Prof. Aditya Sunil Kanade
  • Date and Time: Wednesday, January 25, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The problem of automatically fixing programming errors is a very active research topic in software engineering. This is a challenging problem as fixing even a single error may require analysis of the entire program. In practice, a number of errors arise due to programmers' inexperience with the programming language or lack of attention to detail. We call these common programming errors. These are analogous to grammatical errors in natural languages. Compilers detect such errors, but their error messages are usually inaccurate.

In this work, we present an end-to-end solution, called DeepFix, that can fix multiple such errors in a program without relying on any external tool to locate or fix them. At the heart of DeepFix is a multi-layered sequence-to-sequence neural network with attention which is trained to predict erroneous program locations along with the required correct statements.

On a set of 6971 erroneous C programs written by students for 93 programming tasks, DeepFix could fix 1881 (27%) programs completely and 1338 (19%) programs partially.

Note: This is a practice talk for AAAI 2017.


 

Series: M.Sc. (Engg) Thesis Defense
Title: A Case for Protecting Huge Pages from the Kernel

  • Speaker: Mr. Patel Naman Jagdishchandra
                   M.Sc (Engg) student of CSA
  • Faculty Advisor: Prof. K Gopinath
  • Date and Time: Monday, January 23, 2017, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Controlling memory fragmentation is critical for leveraging the benefits of huge page support offered by modern architectures. The division of free memory into non-contiguous regions over time restricts huge page allocations in the long run. Compaction is a popular mechanism for recovering contiguous blocks of free memory on demand. However, its success rate at accumulating free regions of huge page granularity (e.g., 2MB on x86_64) is low in most situations, due to the presence of unmovable kernel memory. Hence, a prudent page placement algorithm is required to control fragmentation and protect huge pages against kernel memory allocations, in order to sustain system performance over long periods of time.

In this work, we explore the interaction of kernel pages with fragmentation avoidance and recovery mechanisms in Linux. Our analysis shows that the stock kernel memory layout thwarts the progress of memory compaction. Furthermore, compaction can potentially induce more fragmentation, depending on where kernel memory was placed by the underlying page allocator. We discuss the scope for optimization in the current Linux framework and show how effective fragmentation management can yield up to 20% performance improvement and up to 27% energy savings with the help of additional huge pages.


 

Series: Department Seminar
Title: How to be fair and diverse?

  • Speaker: Mr. Amit Deshpande, Microsoft Research India
  • Date and Time: Thursday, January 19, 2017, 2:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In data-driven decision-making, understanding algorithmic bias and correcting it is an important problem. Subsampling data is a basic algorithmic task used for summarization and for training algorithms. An important question is whether it is possible to produce fair subsamples that are also adequately representative of the feature space of the data. Can diversity and fairness be simultaneously ensured? We start by noting that, in some applications, guaranteeing one does not necessarily guarantee the other, and a new approach is required. Subsequently, we present an algorithmic framework which allows us to produce samples that are both fair and diverse. Our experimental results on an image summarization task show marked improvements in fairness without compromising feature diversity by much, giving us the best of both worlds.

Joint work with Elisa Celis, Tarun Kathuria, Nisheeth Vishnoi.
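
As a toy rendering of the two objectives (not the algorithmic framework of the talk, which the abstract does not spell out), a greedy subsampler can trade off group quotas against max-min diversity:

```python
import numpy as np

def fair_diverse_sample(X, groups, quotas, k):
    # X: (n, d) feature matrix; groups: length-n list of group labels;
    # quotas: dict group -> target count (fairness); k: sample size.
    # Greedy: always draw from the group furthest below its quota,
    # picking the candidate farthest from the already-picked set.
    picked, counts = [], {g: 0 for g in quotas}
    for _ in range(k):
        g = min(quotas, key=lambda g: counts[g] / quotas[g])
        cand = [i for i in range(len(X))
                if groups[i] == g and i not in picked]
        if not cand:
            continue                     # group exhausted; skip slot
        if not picked:
            i = cand[0]
        else:
            D = np.linalg.norm(X[cand][:, None] - X[picked][None], axis=2)
            i = cand[int(np.argmax(D.min(axis=1)))]  # max-min diversity
        picked.append(i)
        counts[g] += 1
    return picked
```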

Host Faculty: Dr. Siddharth Barman


 

Series: M.Sc. (Engg) Colloquium
Title: Feature Selection under Multicollinearity and Causal Inference on Time Series

  • Speaker: Indranil Bhattacharya
                   M.Sc. student at CSA
  • Faculty Advisor: Dr. Arnab Bhattacharyya
  • Date and Time: Monday, January 16, 2017, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In this work, we study and extend algorithms for Sparse Regression and Causal Inference problems. Both the problems are fundamental in the area of Data Science.

The goal of the regression problem is to find the "best" relationship between an output variable and input variables, given samples of the input and output values. We consider sparse regression under a high-dimensional linear model with strongly correlated variables, a situation that many existing model selection algorithms cannot handle well. We study the performance of popular feature selection algorithms such as LASSO, Elastic Net and BoLasso, as well as Projected Gradient Descent algorithms, in this setting in terms of their running time and stability. We also propose a new feature selection algorithm, BoPGD, which first clusters the features based on their sample correlation and then performs sparse estimation using a bootstrapped variant of the projected gradient descent method, with projection onto the nonconvex L0 ball. We attempt to characterize the efficiency and consistency of our algorithm by performing a host of experiments on both synthetic and real-world datasets.
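
The projection onto the nonconvex L0 ball is simple hard thresholding, so a bare projected gradient descent for sparse regression (without BoPGD's bootstrapping and feature-clustering stages) can be sketched as follows; the step-size rule is a standard assumption, not taken from the talk:

```python
import numpy as np

def l0_projected_gd(X, y, k, lr=None, iters=500):
    # Iterative hard thresholding: gradient steps on 0.5*||Xw - y||^2
    # followed by projection onto {w : ||w||_0 <= k} (keep top-k entries).
    n, d = X.shape
    lr = lr or 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant
    w = np.zeros(d)
    for _ in range(iters):
        w -= lr * X.T @ (X @ w - y)              # gradient step
        keep = np.argsort(np.abs(w))[-k:]        # projection: top-k support
        mask = np.zeros(d, dtype=bool)
        mask[keep] = True
        w[~mask] = 0.0
    return w
```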

Discovering causal relationships, beyond mere correlation, is widely recognized as a fundamental problem. Causal inference problems use observations to infer the underlying causal structure of the data generating process. The input to these problems is either a multivariate time series or i.i.d. sequences, and the output is a feature causal graph whose nodes correspond to the variables and whose edges capture the direction of causality. For high-dimensional datasets, determining the causal relationships becomes challenging because of the curse of dimensionality. Graphical modeling of temporal data based on the concept of "Granger causality" has gained much attention in this context. Blending Granger methods with model selection techniques such as LASSO enables efficient discovery of a "sparse" subset of causal variables in high-dimensional settings. However, these temporal causal methods use an input parameter L, the maximum time lag: the maximum gap in time between the occurrence of the output phenomenon and the causal input stimulus. In many situations of interest, the maximum time lag is not known, and indeed, finding the range of causal effects is an important problem in itself. In this work, we propose and evaluate a data-driven and computationally efficient method for Granger causality inference in the Vector Auto-Regressive (VAR) model without foreknowledge of the maximum time lag. We present two algorithms, Lasso Granger++ and Group Lasso Granger++, which not only construct the hypothesis feature causal graph but also simultaneously estimate a value of the maximum lag ($\hat{L}$) by balancing the trade-off between "goodness of fit" and "model complexity".
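
A minimal Lasso-Granger sketch for a VAR model with a caller-fixed maximum lag L is shown below; the lag-selection mechanism of Lasso Granger++ is not reproduced, and the penalty value and threshold are assumptions of this illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_granger(ts, L, alpha=0.1):
    # ts: (T, d) array of T time points for d series. Regress each
    # series on the lagged values of all series; a nonzero coefficient
    # block from series j to series i is read as "j Granger-causes i".
    T, d = ts.shape
    X = np.hstack([ts[L - l - 1:T - l - 1] for l in range(L)])
    edges = np.zeros((d, d), dtype=bool)
    for i in range(d):
        coef = Lasso(alpha=alpha).fit(X, ts[L:, i]).coef_.reshape(L, d)
        edges[:, i] = np.abs(coef).max(axis=0) > 1e-8   # column i: causes of i
    return edges
```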


 

Series: Department Seminar
Title: Runtime Visualization and Verification of Java Programs

  • Speaker: Bharat Jayaraman
                   State University of New York (Buffalo)
  • Date and Time: Friday, January 13, 2017, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
This talk will present a state-of-the-art analysis and visualization system called JIVE (Java Interactive Visualization Environment), an open-source plug-in for Eclipse. JIVE is based on two synergistic ideas: query-based debugging and run-time visualizations. The queries are mainly temporal in nature, and the visualizations cater to individual states as well as the history of execution, using UML-like notations. More recently, we have been experimenting with 'runtime verification' by extracting finite-state models of program behavior, checking properties of these models stated in temporal logic, and checking the consistency of these models against design-time specifications. The talk will include a demo of the JIVE system, which may be downloaded from http://www.cse.buffalo.edu/jive.

Speaker Bio:
Dr. Bharat Jayaraman is a Professor in the Department of Computer Science & Engineering at the State University of New York at Buffalo. He received his B.Tech and M.Tech from IIT Madras and his Ph.D. from the University of Utah, Salt Lake City, USA. After receiving his Ph.D., he was on the faculty at the University of North Carolina (Chapel Hill) from 1981 to 1989, and subsequently moved to the University at Buffalo, where he also served as Department Chair from 2001 to 2009. His research interests center around programming languages and software systems. He is presently an Associate Editor for Springer's Transactions on ICT, has served on the editorial board of the Journal of Functional Logic Programming, and has been a guest editor for the Computer Journal, published by Oxford University Press.

Host Faculty: Dr. Murali Krishna Ramanathan


 

Series: M.Sc. (Engg) Colloquium
Title: On the gap between outcomes of voting rules

  • Speaker: Anurita Mathur
                   M.Sc student
                   Department of Computer Science and Automation
  • Faculty Advisor: Dr. Arnab Bhattacharyya
  • Date and Time: Friday, January 13, 2017, 1:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Through the course of history, a variety of voting rules have been used to determine election outcomes. The traditional way in social choice theory to evaluate a voting rule is by checking whether it satisfies axioms deemed to be desirable for voting rules. Celebrated results in social choice theory say that even for a small set of seemingly necessary axioms, no voting rule exists satisfying them. In the face of such impossibility results, it becomes challenging to justify why certain voting rules work well in practice. Although in theory, these rules may yield drastically different outcomes, for real-world datasets, behavioural social choice analyses have found that the rules are often in perfect agreement with each other! This work attempts to give a mathematical explanation of this phenomenon.

In this work, we formulate a quantitative approach towards comparing voting rules by viewing them as two players engaged in a zero-sum game. If rule A selects candidate c_1 while rule B selects candidate c_2, the payoff for A is the number of voters who prefer c_1 to c_2 minus the number who prefer c_2 to c_1. The optimal voting rule according to this criterion (corresponding to the optimal randomized strategy for the game) is the "game-theoretic rule" (GT), while the best deterministic strategy is the well-known Minimax voting rule. We investigate rigorously how various common voting rules fare against each other in terms of the minimum payoff they receive on arbitrary voting profiles. We also study the case where the voting profiles are drawn from a mixture of multinomial logit distributions. This scenario corresponds to a population consisting of a small number of groups, each voting according to a latent preference ranking. We supplement the theoretical findings by empirically comparing the payoffs of voting rules when they are applied to user preferences for movies, as determined from the Netflix competition dataset. On this dataset, we find that the Minimax rule, the Schulze rule, and a deterministic variant of the GT rule perform best in our framework.
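
The payoff computation itself is elementary; here is a sketch (with candidate and profile encodings assumed for illustration) that builds the pairwise-margin matrix and reads off the Minimax winner:

```python
import numpy as np

def payoff_matrix(profile, m):
    # profile: list of rankings, each a list of candidate ids 0..m-1,
    # best first. P[a, b] = #voters preferring a over b minus
    # #voters preferring b over a, matching the payoff in the abstract.
    P = np.zeros((m, m), dtype=int)
    for ranking in profile:
        pos = {c: r for r, c in enumerate(ranking)}
        for a in range(m):
            for b in range(a + 1, m):
                s = 1 if pos[a] < pos[b] else -1
                P[a, b] += s
                P[b, a] -= s
    return P

def minimax_winner(P):
    # Candidate whose worst pairwise margin is best.
    return int(np.argmax(P.min(axis=1)))
```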


 

Series: Department Seminar
Title: Formal Verification of Safety and Stability of Hybrid Systems

  • Speaker: Pavithra Prabhakar
  • Date and Time: Thursday, January 12, 2017, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Hybrid systems refer to systems exhibiting mixed discrete-continuous behaviors and arise as a natural byproduct of the interaction of a network of embedded processors with physical systems. Hybrid systems manifest in safety critical application domains including aeronautics, automotive, robotics and power systems. Reliability is of utmost importance, and a grand challenge in the area is the development of techniques and tools to aid the development of high-confidence hybrid systems.

In this talk, we focus on the verification of two fundamental properties of hybrid systems, namely, safety and stability. Safety specifies that a bad event does not happen along any execution of the system. Stability is a robustness property that captures the notion that small perturbations to the initial state or input to the system result in only small variations in the eventual behavior of the system. We will provide a brief overview of abstraction based analysis techniques being investigated in our group for safety and stability.

Speaker Bio:
Pavithra Prabhakar is an assistant professor and a Keystone Research Scholar in the Computer Science department at Kansas State University. She obtained her doctorate in Computer Science and a masters in Applied Mathematics from the University of Illinois at Urbana-Champaign, and prior to that her masters from the Indian Institute of Science, Bangalore. She has previously been on the faculty of IMDEA Software Institute and held a CMI postdoctoral fellowship at the California Institute of Technology. Her main research interest is in formal analysis of cyber-physical systems with emphasis on both foundational and practical aspects related to automated and scalable techniques for verification and synthesis of hybrid systems. She is the recipient of the Marie Curie Career Integration Grant from the EU and the NSF CAREER Award.

Host Faculty: Deepak D'Souza


 

Series: M.Sc. (Engg) Thesis Defense
Title: Multi-label classification with multiple label correlation orders and structures

  • Speaker: Ms. Anusha Posinasetty
  • Faculty Advisor: Prof. Shirish K. Shevade
  • Date and Time: Wednesday, January 11, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Multilabel classification has attracted much interest in recent times due to the wide applicability of the problem and the challenges involved in learning a classifier for multilabeled data. A crucial aspect of multilabel classification is to discover the structure and order of correlations among labels, and their effect on the quality of the classifier. In this work, we propose a structural SVM based framework which enables us to systematically investigate the importance of label correlations in multilabel classification. The proposed framework is very flexible, provides a unified approach to handling multiple correlation orders and structures in an adaptive manner, and helps to effectively assess the importance of label correlations in improving generalization performance. We perform extensive empirical evaluation on several datasets from different domains and present results on various performance metrics. Our experiments provide, for the first time, interesting insights into the following questions: a) Are label correlations always beneficial in multilabel classification? b) What effect do label correlations have on the multiple performance metrics typically used in multilabel classification? c) Is the label correlation order significant, and if so, what would be the favorable correlation order for a given dataset and a given performance metric? and d) Can we make useful suggestions on the label correlation structure?


 

Series: Department Seminar
Title: From Weak to Strong LP Gaps for all CSPs

  • Speaker: Dr. Madhur Tulsiani
                   Toyota Technological Institute
                   Chicago
  • Date and Time: Monday, January 09, 2017, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
We study the approximability of constraint satisfaction problems (CSPs) by linear programming (LP) relaxations. We show that for every CSP, the approximation obtained by a basic LP relaxation is no weaker than the approximation obtained using relaxations given by Omega(log n / (log log n)) levels of the Sherali-Adams hierarchy on instances of size n.

It was proved by Chan et al. [FOCS 2013] that any polynomial size LP extended formulation is no stronger than the relaxation obtained by a super-constant number of levels of the Sherali-Adams hierarchy. Combining this with our result implies that any polynomial size LP extended formulation is no stronger than the basic LP.

Using our techniques, we also simplify and strengthen the result by Khot et al. [STOC 2014] on (strong) approximation resistance for LPs. They provided a necessary and sufficient condition under which Omega(log log n) levels of the Sherali-Adams hierarchy cannot achieve an approximation better than a random assignment. We simplify their proof and strengthen the bound to Omega(log n / (log log n)) levels.

Joint work with Mrinalkanti Ghosh.

Host Faculty: Dr. Anand Louis


 

Series: M.Sc. (Engg) Thesis Defense
Title: On algebraic and analytic properties of polynomials over finite fields

  • Speaker: Chetan Gupta
                   M.Sc student at CSA
  • Faculty Advisor: Dr. Arnab Bhattacharyya
  • Date and Time: Monday, January 09, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In this work, we provide some new and interesting results in the emerging theory of higher-order Fourier analysis. The goal of this theory is to understand, at a detailed level, the connection between the algebraic structure and the analytic properties of polynomials (and functions more generally). For instance, consider Problem 1: what is the trade-off between the equidistribution of chi(P) and its "structure", for an additive character chi and a polynomial P over a finite field?

Specifically, this thesis consists of two major results. Our first result solves Problem 1 over general fields. Previously, most work on the problem was over fields of prime order. We extend the tools of higher-order Fourier analysis to analyze functions over general finite fields. Let K be a field extension of a prime field F_p. We show: if P: K^n -> K is a polynomial of degree <= d, and |E[chi(P(x))]| > |K|^{-s} for some s > 0 and a non-trivial additive character chi, then P is a function of O_{d,s}(1) many non-classical polynomials of weight degree < d. The definition of non-classical polynomials over non-prime fields is one of the contributions of this work.

Next, we explore the possibility of using the techniques of higher-order Fourier analysis in practice. For our second result, we design algorithms inspired by higher-order Fourier analysis for decomposing compositions of low-degree polynomials over prime fields. Specifically, our algorithm applies to the case when P = sum_{i=1}^s Q_i * R_i for some s > 0 and randomly chosen quadratic polynomials Q_i, R_i. The goal here is to retrieve the quadratic polynomials given P, and this seems to be out of reach of known standard factorization algorithms.

Finally, we perform experiments on the general polynomial decomposition algorithm of [Bha14, BHT15], which show that the algorithm can be very inefficient in practice.


 

Series: Department Seminar
Title: A Fine-Grained Approach for Designing (Time and Space) Efficient Algorithms

  • Speaker: Dr Rajesh Chitnis, Weizmann Institute of Science, Israel
  • Date and Time: Thursday, January 05, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The classical approach for designing algorithms measures time complexity only as a function of the input size. In the 90s, Downey and Fellows proposed a fine-grained approach for NP-hard problems: one or more parameters of the input instance are defined, and we investigate the effect of these parameters on the running time. The goal is to design algorithms that work efficiently when the parameters of the input instance are small, even if the size of the input is large. Formally, a problem is said to be fixed-parameter tractable (FPT) with respect to a parameter k if the problem can be solved in time f(k)·n^{O(1)}, where f is a computable function and n is the input size.

In the first part of the talk, I will give a brief overview of FPT algorithms. This active area of research has seen many new developments over the last few years. The standard techniques for designing FPT algorithms on undirected graphs such as bidimensionality, treewidth-based dynamic programming, etc. seem to break down for directed graphs. I will present some of my work which overcomes this barrier, and designs optimal FPT and XP algorithms for cut and connectivity problems on directed graphs.

In the second part of the talk, I will present my recent work on parameterized streaming algorithms. For the Maximum Matching problem, I will describe optimal streaming algorithms in various streaming models such as insertion-only, promised and general insertion-deletion.

Speaker Bio:
Rajesh Chitnis is a postdoctoral fellow in the Faculty of Mathematics and Computer Science at Weizmann Institute of Science, Israel. He obtained his PhD from the University of Maryland in 2014. He is interested in theoretical computer science, in particular on designing optimal fine-grained algorithms for graph problems. He has been a recipient of several honors including Simons Award for Graduate Students in Theoretical Computer Science (2013-15), Best Paper in European Symposium on Algorithms (ESA) 2013, Larry S. Davis Doctoral Dissertation Award from University of Maryland (2014), and Board of Visitor's Award for Outstanding Graduate Student at University of Maryland (2014).


 

Series: Ph.D. Thesis Defense
Title: Consistency of Spectral Algorithms for Hypergraphs under Planted Partition model

  • Speaker: Debarghya Ghoshdastidar
  • Faculty Advisor: Prof. Ambedkar Dukkipati
  • Date and Time: Monday, January 02, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Hypergraph partitioning lies at the heart of a number of problems in machine learning as well as other engineering disciplines. While partitioning uniform hypergraphs is often required in computer vision problems that involve multi-way similarities, non-uniform hypergraph partitioning has applications in database systems, circuit design etc. As in the case of graphs, it is known that for given objective and balance constraints, the problem of optimally partitioning a hypergraph is NP-hard. Yet, over the last two decades, several efficient heuristics have been studied in the literature and their empirical success is widely appreciated. In contrast to the extensive studies related to graph partitioning, the theoretical guarantees of hypergraph partitioning approaches have not received much attention in the literature. The purpose of this thesis is to establish the statistical error bounds for certain spectral algorithms for partitioning uniform as well as non-uniform hypergraphs.

The mathematical framework considered in this thesis is the following. Let $\mathcal{V}$ be a set of $n$ vertices, and $\psi: \mathcal{V} \to \{1,\ldots,k\}$ be a (hidden) partition of $\mathcal{V}$ into $k$ classes. A random hypergraph $(\mathcal{V},\mathcal{E})$ is generated according to a planted partition model, i.e., subsets of $\mathcal{V}$ are independently added to the edge set $\mathcal{E}$ with probabilities depending on the class memberships of the participating vertices. Let $\psi'$ be the partition of $\mathcal{V}$ obtained from a certain algorithm acting on a random realization of the hypergraph. We provide an upper bound on the number of disagreements between $\psi$ and $\psi'$. To be precise, we show that under certain conditions, the asymptotic error is $o(n)$ with probability $(1-o(1))$. In the existing literature, such error rates are only known in the case of graphs (Rohe et al., Ann. Statist., 2011; Lei & Rinaldo, Ann. Statist., 2015), where the planted model coincides with the popular stochastic block model. Our results are based on matrix concentration inequalities and perturbation bounds, and the derived bounds can be used to comment on the consistency of spectral hypergraph partitioning algorithms.

It is quite common in the literature to resort to a spectral approach when the quantity of interest is a matrix, for instance, the adjacency or Laplacian matrix for graph partitioning. This is certainly not true for hypergraph partitioning, as the adjacency relations cannot be encoded into a symmetric matrix as in the case of graphs. However, if one restricts the problem to $m$-uniform hypergraphs for some $m \geq 2$, then a symmetric tensor of order $m$ can be used to express the multi-way interactions or adjacencies. Thus, the use of tensor spectral algorithms, based on the spectral theory of symmetric tensors, is a natural choice in this scenario. We observe that a wide variety of uniform hypergraph partitioning methods studied in the literature can be related to one of two principal approaches: (1) solving a tensor trace maximization problem, or (2) using the higher order singular value decomposition of tensors. We derive statistical error bounds to show that both these approaches lead to consistent partitioning algorithms.

Our results also hold when the hypergraph under consideration allows weighted edges, a situation that is commonly encountered in computer vision applications such as motion segmentation, image registration etc. In spite of the theoretical guarantees, a tensor spectral approach is not preferable in this setting due to the time and space complexity of computing the weighted adjacency tensor. Keeping this practical scenario in mind, we prove that consistency can still be achieved by incorporating certain tensor sampling strategies. In particular, we show that if the edges are sampled according to certain distribution, then consistent partitioning can be achieved with only few sampled edges. Experiments on benchmark problems demonstrate that such sampled tensor spectral algorithms are indeed useful in practice.

While vision tasks mostly involve uniform hypergraphs, in database and electronics applications one often finds non-uniform hypergraphs with edges of varying sizes. These hypergraphs cannot be expressed in terms of adjacency matrices or tensors, and hence the use of a spectral approach is tricky in this context. The partitioning problem becomes more challenging due to the fact that, in practice, these hypergraphs are quite sparse and hence provide less information about the partition. We consider spectral algorithms for partitioning clique and star expansions of hypergraphs, and study their consistency under a sparse planted partition model.
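
As a concrete instance of the class of algorithms analyzed, spectral partitioning of the clique expansion of a (possibly non-uniform) hypergraph can be sketched as follows; the edge weighting and the clustering backend are assumptions of this illustration, not the thesis's exact algorithm:

```python
import numpy as np
from sklearn.cluster import KMeans

def clique_expansion_partition(n, edges, k):
    # edges: iterable of tuples of vertex ids (edges of any size).
    # Replace each hyperedge by a clique, then spectrally cluster the
    # resulting weighted graph via its normalized Laplacian.
    A = np.zeros((n, n))
    for e in edges:
        for i in e:
            for j in e:
                if i != j:
                    A[i, j] += 1.0
    d = A.sum(axis=1)
    d[d == 0] = 1.0
    L = np.eye(n) - A / np.sqrt(np.outer(d, d))   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)                   # ascending eigenvalues
    return KMeans(n_clusters=k, n_init=10).fit_predict(vecs[:, :k])
```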

The results of hypergraph partitioning can be further extended to address the well-known hypergraph vertex coloring problem, where the objective is to color the vertices such that no edge is monochromatic. The hardness of this problem is well established. In fact, even when a hypergraph is bipartite or 2-colorable, it is NP-hard to find a proper 2-coloring for it. We propose a spectral coloring algorithm, and show that if the non-monochromatic subsets of vertices are independently added to the edge set with certain probabilities, then with probability $(1-o(1))$, our algorithm succeeds in coloring bipartite hypergraphs with only two colors.

To the best of our knowledge, these are the first known results on the consistency of partitioning general hypergraphs.

Host Faculty: Prof. Ambedkar Dukkipati


