Department of Computer Science and Automation Department of Computer Science and Automation, IISc, Bangalore, India Indian Institute of Science
HOME | ABOUT US | PEOPLE | RESEARCH | ACADEMICS | FACILITIES | EVENTS / SEMINARS | NEWS | CONTACT US

UPCOMING SEMINARS

Series: M.Sc. (Engg) Thesis Defense
Title: Automatic Optimization of Geometric Multigrid Methods using a DSL Approach

  • Speaker: Mr. Vinay Vasista
                   M.Sc. (Engg) Student
                   Dept. of CSA
                   IISc
  • Faculty Advisor: Prof. Uday Kumar Reddy B
  • Date and Time: Friday, June 30, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The Geometric Multigrid (GMG) method is widely used in numerical analysis to accelerate the convergence of partial differential equations solvers using a hierarchy of grid discretizations. Multiple grid sizes and recursive expression of multigrid cycles make the task of program optimization tedious and error-prone. A high-level language that aids domain experts to quickly express complex algorithms in a compact way using dedicated constructs for multigrid methods and with good optimization and parallelization support is thus valuable.

In this work, we demonstrate how high performance can be achieved along with enhanced programmability for GMG, with new language and optimization support in the PolyMage domain specific language framework. We compare our approach with (a) hand-optimized codes, (b) hand-optimized codes optimized in conjunction with techniques based on the polyhedral compiler framework, and (c) the existing PolyMage optimizer adapted for Multigrid. We use benchmarks varying in Multigrid cycle structure, smoothing steps, and other aspects, in addition to the NAS Multigrid benchmark for evaluation.

On a 24-core Intel Xeon Haswell multicore system, our automatically optimized codes achieve a mean improvement of 3.2x over a straightforward parallelization, and an improvement of 1.31x over the PolyMage DSL optimizer. In most cases, we were also able to match or outperform hand-optimized implementations available. The results demonstrate the effectiveness of the approach on delivering high performance and high programmer productivity.

Speaker Bio:
Vinay Vasista is an M.Sc student in thr Multicore Computing Lab, Department of CSA, Indian Institute of Science. He got his bachelor's degree in Computer Science from Sri Jayachamarajendra College of Engineering, Mysore in 2013. His interests are in High Performance Computing, Parallel programming, Compilers and program optimization. He is currently working as Software Developer at Mentor Graphics Corporation, India.

Video Archive Go Top

 

Series: Ph.D. Colloquium
Title: Static analysis and automated testing of file-processing programs

  • Speaker: M. Raveendra Kumar
  • Faculty Advisor: K. V. Raghavan
  • Date and Time: Thursday, June 29, 2017, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Programs that process data residing in files are widely used in domains such as banking, healthcare, networking, and log-file analysis. For these programs, there is a strong need for automated tool support for analysis and transformation tasks such as bug detection, program understanding, and program re-modularization. However, due to the complex idioms used in file-processing programs, precise analysis of these programs as well as automated testing of these programs is a challenging task. These are the problems that we address in this thesis.

Our first contribution is about dataflow analysis of file-processing programs. Our key insight here is that the analysis of a file processing program can be made more precise and useful if knowledge of the input file format of the program is made available to the analysis and incorporated into the transfer functions. We show empirical results for this using standard dataflow problems. We also develop two novel applications of this technique -- the first one to specialize a program to a given "restricted" input file format, and the second one to verify if a program "conforms" to a given input file format. Experiments on a set of benchmark programs showed that our approach provides precise output compared to naive approaches.

For precise dataflow analysis of programs with multiple procedures, *context sensitive* inter-procedural data flow analysis is required. It is very expensive in its full form. As our second contribution, we propose a *prefix* approximation for context-sensitive analysis, wherein a prefix of the full context stack is used to tag dataflow facts. Our technique, which is in contrast with the standard *suffix* approximation technique, is designed to be more scalable on programs that have distinct *application* and *library* layers, while offering comparable or even better precision within the application layer. We analyzed several large enterprise file-processing programs using an implementation of our technique, and compared it with the suffix approximation technique. Our approach's performance was several times better than that of the suffix approximation, while generally not suffering any precision penalty in the application layer of the programs. It is notable that this approach could be equally applicable in the context of non file-processing programs as well.

Our final contribution is regarding automated testing of file-processing programs. *Fuzz testing* is an automated software-testing technique that aims to uncover run-time errors by executing the target program with a large number of inputs generated automatically and systematically. *Grey box* fuzzing is a kind of fuzzing that employs lightweight instrumentation of the program, and observes the behaviors exhibited on a set of test runs. It then uses this information to generate new test inputs that might exercise other behaviors. This process is iterated until as many behaviors as possible get exhibited. AFL is a state-of-the-art fuzzer for file-processing programs. With AFL, if a test input causes a new region of code in the program to be reached, it is considered as a new behavior. In other words, code coverage is the objective. However, a vulnerability at a program point may not get exploited in every visit to the program point; only certain program paths that lead to that point may expose the vulnerability. We extend grey-box fuzzing to take into account a given vulnerability of interest (e.g., buffer overruns), and to try explicitly to traverse paths that are likely to expose such vulnerabilities. We have implemented our approach by enhancing AFL, in the context of the buffer overrun vulnerability. We evaluated our approach on a suite of benchmark programs, and compared it with AFL in its standard form. Our approach could uncover many more number of buffer overflow errors in these benchmarks compared to standard AFL for the same given amount of fuzzing time.

Speaker Bio:
M. Raveendra Kumar is a PhD student at the Department of CSA, IISc. He is also a researcher with TCS Innovation Labs. His interests are in static analysis and dynamic analysis of enterprise applications.

Video Archive Go Top

 

Series: M.Sc. (Engg) Colloquium
Title: Deep learning models for Few-shot and Metric Learning

  • Speaker: Akshay Mehrotra
  • Faculty Advisor: Prof. Ambedkar Dukkipati
  • Date and Time: Tuesday, June 27, 2017, 12:00 PM
  • Venue: CSA Multimedia Class (Room No. 252, First Floor)

Abstract
Deep neural networks achieve unprecedented performance levels over many tasks in the traditional supervised setting and scale well with large quantities of data. On the other hand improving performance in the low-data regime is understudied and the sheer number of parameters versus small amounts of data makes it a challenging task. The problem of learning from a very small amount is known as few-shot learning. A related problem is that of metric learning over disjoint training and testing classes, where the task is to learn a function that maps a data point to a low-dimensional manifold such that similar points are closer in the low-dimensional space. In this thesis, we propose a couple of deep learning approaches for few-shot learning, and then extend them to the related problem of metric learning over disjoint training and testing classes.

We first argue that a more expressive pairwise similarity matching component is crucial for solving the few-shot learning problem. Towards that end, a network design with a learnable and more expressive similarity objective is proposed by extending the deep residual network. This residual pairwise network approximates a learned metric in the representation space and outperforms previous state-of-the-art on the challenging mini-Imagenet dataset for few-shot learning by getting over 55% accuracy for the 5-way classification task over unseen classes. We also evaluate the generalization behaviour of deep residual networks with a varying number of parameters over classes not observed during training.

Next, since regularization plays a key role in learning with small amounts of data, an additional generator network is proposed by extending the Generative Adversarial Network (GAN) framework for disjoint training and testing classes. This provides a strong regularizer by leveraging the generated data samples. The proposed model can generate plausible variations of exemplars over unseen classes and outperforms strong discriminative baselines with L2 regularization for few shot classification tasks.

Finally, the idea of regularizing by adversarial generation is extended to the metric learning setting over disjoint training and testing classes. The proposed model is an efficient alternative to the hard negative mining scheme necessary when training with triplets in the large margin nearest neighbours setting. The proposed model shows performance and efficiency improvement over models trained without negative-mining or with approximations commonly used for metric learning with triplet loss respectively.

Video Archive Go Top

 

Series: Department Seminar
Title: Provably Secure Computing using Certified Software and Trusted Hardware

  • Speaker: Prof. Sanjit A. Seshia, EECS Dept., UC-Berkeley
  • Date and Time: Tuesday, June 27, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Security-critical applications face a range of threats including exploits in privileged software such as the operating system and hypervisor. In order to protect sensitive data from such privileged adversaries, there is increasing interest in developing and using secure hardware primitives, such as Intel SGX, ARM TrustZone, and Sanctum RISC-V extensions. In this talk, I will discuss the problem of building applications with provable security guarantees by including only these hardware primitives (and very little software) in the trusted computing base. We have developed compiler and verification techniques to develop applications that can be certified (at the level of machine code) to not leak secrets, both via their outputs and certain side channels. Furthermore, I will discuss how formal models of trusted hardware can be leveraged to port programs and their correctness proofs across different hardware platforms.

Speaker Bio:
Sanjit A. Seshia is a Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. He received an M.S. and Ph.D. in Computer Science from Carnegie Mellon University, and a B.Tech. in Computer Science and Engineering from IIT-Bombay. His research interests are in dependable computing and computational logic, with a current focus on applying automated formal methods to problems in cyber-physical systems, computer security, electronic design automation, and synthetic biology. His Ph.D. thesis work on the UCLID verifier and decision procedure helped pioneer the area of satisfiability modulo theories (SMT) and SMT-based verification. He is co-author of a widely-used textbook on embedded, cyber-physical systems and has led the development of technologies for cyber-physical systems education based on formal methods. His awards and honors include a Presidential Early Career Award for Scientists and Engineers (PECASE), an Alfred P. Sloan Research Fellowship, the Frederick Emmons Terman Award for contributions to electrical engineering and computer science education, and the School of Computer Science Distinguished Dissertation Award at Carnegie Mellon University.

Host Faculty: Prof. Vinod Ganapathy

Video Archive Go Top

 

PAST SEMINARS

Series: Department Seminar
Title: Understanding the cost and the opportunity to enhance programmability of heterogeneous processors

  • Speaker: Dr Arkaprava Basu, AMD Research, Austin TX
  • Date and Time: Monday, June 19, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
As the benefits from transistor scaling diminish, the computing is turning increasingly heterogeneous. In a heterogeneous processor domain-specific accelerators like graphics processing unit (GPU) are coupled with general purpose CPUs for better performance and energy efficiency. However, programming such heterogeneous processors, consisting of multiple disparate computing element, is a challenge. In this talk, we will discuss two emerging features that can address this -- namely, (1) the shared virtual memory, and (2) the ability to invoke operating system services from an accelerator (e.g., GPU). Shared virtual memory or SVM makes it easy to share data across the CPU and accelerators by presenting them with the same view of the memory. Here, we will discuss possible performance pitfall of the SVM implementation in a heterogeneous processor and the opportunities to enhance its future iterations. Next, we will discuss how an accelerator's ability to invoke operating system services like page fault, file access, is crucial to make it a true peer to the CPU. We will then discuss the performance implications of this capability and avenues for betterment and further exploration.

Finally, we will conclude the talk by discussing several future research opportunities to embrace heterogeneity across the hardware and the software stack and across the compute and the memory.

Speaker Bio:
Arkaprava (Arka) is a researcher with AMD Research at Austin,TX. Arka’s research interest broadly lies in the area of computer architecture. Since joining AMD Research in January of 2014, Arka has been exploring various aspects of virtual memory management techniques and heterogeneous processing as part of US Department of Energy’s Exascale computing initiative. Before that he obtained his PhD in Computer Science from the University of Wisconsin-Madison under the supervision of Prof. Mark D. Hill and Prof. Michael M. Swift in December, 2013. Prior to that Arka obtained M.Tech in computer science from IIT Kanpur and B.Tech from Government Engineering College in Kalyani, West Bengal.

Host Faculty: Prof. Matthew Jacob

Video Archive Go Top

 

Series: Department Seminar
Title: Rethinking TLB Designs in Virtualized Environments: A Very Large Part-of-Memory TLB

  • Speaker: Dr. Nagendra Gulur Dwarakanath, Texas Instruments
  • Date and Time: Friday, June 16, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
With increasing deployment of virtual machines for cloud services and server applications, memory address translation overheads in virtualized environments have received great attention. In the radix-4 type of page tables used in x86 architectures, a TLB miss necessitates up to 24 memory references for one guest to host translation. While dedicated page walk caches and such recent enhancements eliminate many of these memory references, our measurements on the Intel Skylake processors indicate that many programs in virtualized mode of execution still spend hundreds of cycles for translations that do not hit in the TLBs.

This talk presents an innovative scheme to reduce the cost of address translations by using a very large Translation Lookaside Buffer that is part of memory, the POM-TLB. In the POM-TLB, only one access is required instead of up to 24 accesses required in commonly used 2D walks with radix-4 type of page tables. Even if many of the 24 accesses may hit in the page walk caches, the aggregated cost of the many hits plus the overhead of occasional misses from page walk caches still exceeds the cost of one access to the POM-TLB. Since the POM-TLB is part of the memory space, TLB entries (as opposed to multiple page table entries) can be cached in large L2 and L3 data caches, yielding signifiicant benefits. Through detailed evaluation running SPEC, PARSEC and graph workloads, we demonstrate that the proposed POM-TLB improves performance by approximately 10% on average. The improvement is more than 16% for 5 of the benchmarks. It is further seen that a POM-TLB of 16MB size can eliminate nearly all TLB misses in 8-core systems.

Speaker Bio:
Dr. Nagendra Gulur Dwarakanath works for Texas Instruments, in the area of executable specifications for automotive processors and micro controllers. He obtained his M.E. and PhD degrees from IISc. His research interests are primarily in Computer Architecture with specific focus on memory systems, heterogeneous architectures, accelerators, emerging workloads and programming models.

Host Faculty: R. Govindarajan

Video Archive Go Top

 

Series: Department Seminar
Title: Static Deadlock Detection for Asynchronous C# Programs

  • Speaker: Mr. Anirudh Santhiar
                   Ph.D Student
                   Dept. of CSA
                   IISc
  • Faculty Advisor: Prof. Aditya Sunil Kanade
  • Date and Time: Wednesday, June 14, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Asynchronous programming is a standard approach for designing responsive applications. Modern languages such as C# provide async/await primitives for the disciplined use of asynchrony. In spite of this, programs can deadlock because of incorrect use of blocking operations along with non-blocking (asynchronous) operations. While developers are aware of this problem, there is no automated technique to detect deadlocks in asynchronous programs.

We present a novel representation of control flow and scheduling of asynchronous programs, called the continuation scheduling graph, and formulate necessary conditions for a deadlock to occur in a program. We design static analyses to construct continuation scheduling graphs of asynchronous C# programs and to identify deadlocks in them. We have implemented the static analyses in a tool called DeadWait. Using DeadWait, we found 43 previously unknown deadlocks in 11 asynchronous C# libraries. We reported the deadlocks to the library developers. They have confirmed and fixed 40 of them.



*This is a practice talk for PLDI 2017*

Video Archive Go Top

 

Series: Department Seminar
Title: Cumulative prospect theory meets bandits and reinforcement learning

  • Speaker: Dr. Prashanth L.A
                   Department of Computer Science and Engineering
                   Indian Institute of Technology Madras
  • Date and Time: Monday, June 12, 2017, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Traditional optimization approaches rely on the central assumption that expected (value) utility appropriately captures preferences, but there is overwhelming evidence in behavioral economics research that indicates that this is not the case for human decision making. Violations of the expected value-based preferences in human-based decision making systems can be addressed by incorporating distortions in the underlying probabilities of the system. We incorporate probabilistic distortions in a stochastic optimization framework, specifically using the well-established cumulative prospect theory (CPT) of Tversky and Kahneman. The stochastic optimization setting presents two particular challenges when CPT is applied: estimating the CPT objective requires estimations of the entire distribution of the underlying random variable, and traditional optimization approaches must deal with biased (albeit asymptotically unbiased) samples of the CPT objective. We propose sample-efficient estimation schemes and use the resulting estimators for optimizing an appropriate CPT objective in both finite discrete optimization (e.g., multi-armed bandit models, ranking & selection) and in settings with continuous decision variables (e.g., policy optimization for Markov decision processes) using gradient-based and global stochastic optimization algorithms. Two transportation applications will be used to test and validate the proposed algorithms in the CPT framework.

Speaker Bio:
Prashanth L.A. is an Assistant Professor in the Department of Computer Science and Engineering at Indian Institute of Technology Madras. Prior to this, he was a postdoctoral researcher at the Institute for Systems Research, University of Maryland - College Park from 2015 to 2017 and at INRIA Lille - Team SequeL from 2012 to 2014. From 2002 to 2009, he was with Texas Instruments (India) Pvt Ltd, Bangalore, India. He received his Masters and Ph.D degrees in Computer Science and Automation from Indian Institute of Science, in 2008 and 2013, respectively. He was awarded the third prize for his Ph.D. dissertation, by the IEEE Intelligent Transportation Systems Society (ITSS). He is the coauthor of a book entitled `Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods', published by Springer in 2013. His research interests are in reinforcement learning, stochastic optimization and multi-armed bandits, with applications in transportation systems, wireless networks and recommendation systems.

Host Faculty: Prof. Shalabh Bhatnagar

Video Archive Go Top

 

Series: Ph.D. Thesis Defense
Title: Ranking from Pairwise Comparisons: The Role of the Pairwise Preference Matrix

  • Speaker: Mr. Arun Rajkumar
                   Ph.D Student
                   Dept. of CSA
                   IISc
  • Faculty Advisor: Prof. Chiranjib Bhattacharyya
  • Date and Time: Friday, June 09, 2017, 9:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Ranking a set of candidates or items from pairwise comparisons is a fundamental problem that arises in many settings such as elections, recommendation systems, sports team rankings, document rankings and so on. Indeed it is well known in the psychology literature that when a large number of items are to be ranked, it is easier for humans to give pairwise comparisons as opposed to complete rankings. The problem of ranking from pairwise comparisons has been studied in multiple communities such as machine learning, operations research, linear algebra, statistics etc., and several algorithms (both classic and recent) have been proposed. However, it is not well understood under what conditions these different algorithms perform well. In this thesis, we aim to fill this fundamental gap, by elucidating precise conditions under which different algorithms perform well, as well as giving new algorithms that provably perform well under broader conditions. In particular, we consider a natural statistical model wherein for every pair of items (i,j), there is a probability P_ij such that each time items i and j are compared, item j beats item i with probability P_ij. Such models, which we summarize through a matrix containing all these pairwise probabilities, have been used explicitly or implicitly in much previous work in the area; we refer to the resulting matrix as the pairwise preference matrix, and elucidate clearly the crucial role it plays in determining the performance of various algorithms.

In the first part of the thesis, we consider a natural generative model where all pairs of items can be sampled and where the underlying preferences are assumed to be acyclic. Under this setting, we elucidate the conditions on the pairwise preference matrix under which popular algorithms such as matrix Borda, spectral ranking, least squares and maximum likelihood under a Bradley-Terry-Luce (BTL) model produce optimal rankings that minimize the pairwise disagreement error. Specifically, we derive explicit sample complexity bounds for each of these algorithms to output an optimal ranking under interesting subclasses of the class of all acyclic pairwise preference matrices. We show that none of these popular algorithms is guaranteed to produce optimal rankings for all acyclic preference matrices. We then propose a novel support vector machine based rank aggregation algorithm that provably does so.

In the second part of the thesis, we consider the setting where preferences may contain cycles. Here, finding a ranking that minimizes the pairwise disagreement error is in general NP-hard. However, even in the presence of cycles, one may wish to rank 'good' items ahead of the rest. We develop a framework for this setting using notions of winners based on tournament solution concepts from social choice theory. We first show that none of the existing algorithms are guaranteed to rank winners ahead of the rest for popular tournament solution based winners such as top cycle, Copeland set, Markov set etc. We propose three algorithms - matrix Copeland, unweighted Markov and parametric Markov - which provably rank winners at the top for these popular tournament solutions. In addition to ranking winners at the top, we show that the rankings output by the matrix Copeland and the parametric Markov algorithms also minimize the pairwise disagreement error for certain classes of acyclic preference matrices.

Finally, in the third part of the thesis, we consider the setting where the number of items to be ranked is large and it is impractical to obtain comparisons among all pairs. Here, one samples a small set of pairs uniformly at random and compares each pair a fixed number of times; in particular, the goal is to come up with good algorithms that sample comparisons among only O(n log(n)) item pairs (where n is the number of items). Unlike existing results for such settings, where one either assumes a noisy permutation model (under which there is a true underlying ranking and the outcome of every comparison differs from the true ranking with some fixed probability) or assumes a BTL or Thurstone model, we develop a general algorithmic framework based on ideas from matrix completion, termed low-rank pairwise ranking, which provably produces an optimal ranking by comparing only O(n log(n)) pairs, O(log(n)) times each, not only for popular classes of models such as BTL and Thurstone, but also for much more general classes of models wherein a suitable transform of the pairwise probabilities leads to a low-rank matrix; this subsumes the guarantees of many previous algorithms in this setting.

Overall, our results help to understand at a fundamental level the statistical properties of various algorithms for the problem of ranking from pairwise comparisons, and under various natural settings, lead to novel algorithms with improved statistical guarantees compared to existing algorithms for this problem.

Video Archive Go Top

 

Series: Department Seminar
Title: Common Information based Markov Perfect Equilibrium in Dynamic Games with Asymmetric Information

  • Speaker: Dr. Abhishek Gupta
  • Date and Time: Thursday, June 08, 2017, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In many engineering systems, agents compete to optimize their own objective functions. Agents may have different information. For instance, when a control system is attacked by a malicious attacker, the attacker may not know the complete details about the system or the underlying states of the system. Motivated by these considerations, we developed a framework to compute Nash equilibrium in dynamic games with asymmetric information.

To compute the equilibrium, we divided the agents’ information into common information and private information. We show that the belief on state and private information given the common information forms a Markov state of a certain high dimensional virtual game. Under certain assumptions, we can compute a Markov perfect equilibrium of this new virtual game, which can be projected into the original game to compute a Nash equilibrium of the original game. We will discuss the extension of this technique to linear-quadratic-Gaussian games and finite zero-sum games.

Speaker Bio:
Abhishek Gupta is an Assistant Professor in the ECE department at The Ohio State University. Prior to joining OSU, he was a postdoctoral researcher at the University of Southern California. He received his PhD in Aerospace Engineering (2014), MS in Applied Mathematics (2012), and MS in Aerospace Engineering (2011) from UIUC, and B.Tech. in Aerospace Engineering from IIT Bombay (2009). His research interests are optimization, game theory, market design, and cybersecurity

Host Faculty: Prof. Siddharth Barman

Video Archive Go Top

 

Series: Ph.D. Colloquium
Title: Efficient Static Analyses for Concurrent Programs

  • Speaker: Suvam Mukherjee
  • Faculty Advisor: Deepak D'Souza
  • Date and Time: Thursday, June 01, 2017, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Concurrent programs are pervasive owing to the increasing adoption of multi-core systems across the entire computing spectrum. Unfortunately, writing correct and efficient concurrent programs is a hard endeavour. Automatically reasoning about the behaviours of such programs is harder still -- an exponential number of behaviours are possible due to interactions between the threads. Thus, concurrent programs provide fertile grounds for a large class of insidious defects, which are difficult to detect and can cause failures.

Static analysis techniques infer semantic properties of programs without executing them. They are attractive because they are sound (they can guarantee the absence of bugs), can execute with a fair degree of automation, and do not depend on test cases. However, current static analyses techniques for concurrent programs are either precise and prohibitively slow, or fast but imprecise. In this thesis, we partially address this problem by designing novel efficient static analyses for concurrent programs.

In the first part of the thesis, we provide a framework for designing and proving the correctness of data flow analyses that target data race free (DRF) programs, and which are in the same spirit as the "sync-CFG" analysis originally proposed in [De2011]. To achieve this, we first propose a novel concrete semantics for DRF programs called L-DRF, that is thread-local in nature with each thread operating on its own copy of the data state. We show that abstractions of our semantics allow us to reduce the analysis of DRF programs to a sequential analysis. This aids in rapidly porting existing sequential analyses to scalable analyses for DRF programs. Next, we parameterize the semantics with a partitioning of the program variables into "regions" which are accessed atomically. Abstractions of the region-parameterized semantics yield more precise analyses for "region-race free" concurrent programs. We instantiate these abstractions to devise efficient relational analyses for race free programs, which we have implemented in a prototype tool called RATCOP. On the benchmarks, RATCOP was able to prove up to 65% of the assertions, in comparison to 25% proved by a version of the analysis from [De2011]. In a comparative study with a recent abstract interpretation based concurrent static analyser, RATCOP was up to 5 orders of magnitude faster on a set of loosely-coupled concurrent programs.

In the second part of the thesis, we provide techniques to detect all "high-level" data races in an RTOS kernel. A high-level race occurs when an execution interleaves instructions corresponding to user-annotated critical accesses to shared memory structures. Such races are good indicators of atomicity violations. We propose a technique for detecting all high-level dataraces in a system library, like the kernel API of a real-time operating system (RTOS) that relies on flag-based scheduling and synchronization. Our methodology is based on model-checking, but relies on a meta-argument to bound the number of task processes needed to orchestrate a race. We describe our approach in the context of FreeRTOS, a popular RTOS in the embedded domain. Using our technique, a user is able to soundly disregard 99.8% of an estimated 41,000 potential high-level races. Our tool detected 38 high-level data races in FreeRTOS, out of which 16 were harmful.

Speaker Bio:
Suvam Mukherjee is a PhD student in the CSA department.

Host Faculty: Deepak D'Souza

Video Archive Go Top

 

Series: M.Sc. (Engg) Colloquium
Title: Large Scale Graph Processing in a Distributed Environment

  • Speaker: Nitesh Upadhyay
  • Faculty Advisor: Y.N. Srikant
  • Date and Time: Thursday, June 01, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Graph algorithms are ubiquitously used across domains. They exhibit parallelism, which can be exploited on parallel architectures, such as, multi-core processors and accelerators. However, real world graphs are massive in size and cannot fit into the memory of a single machine. Such large graphs are partitioned and processed in a distributed cluster environment which consist of multiple GPUs and CPUs. Existing frameworks that facilitate large scale graph processing in the distributed cluster have their own style of programming and require extensive involvement by the user in communication and synchronization aspects. Adaptation of these frameworks appears to be an overhead for a programmer. Furthermore, these frameworks have been developed to target only CPU clusters and lack the ability to harness the GPU architecture.

We provide a back-end framework to the graph Domain Specific Language, Falcon, for large scale graph processing on CPU and GPU clusters. The Motivation behind choosing this DSL as a front-end is its shared-memory based imperative programmability feature. Our framework generates Giraph code for CPU clusters. Giraph code runs on the Hadoop cluster and is known for scalable and fault-tolerant graph processing. For GPU cluster, Our framework applies a set of optimizations to reduce computation and communication latency, and generates efficient CUDA code coupled with MPI.

Experimental evaluations show the scalability and performance of our framework for both CPU and GPU clusters. The performance of the framework generated code is comparable to the manual implementations of various algorithms in distributed environments.

Speaker Bio:
Nitesh Upadhyay is an M.Sc(Engg.) student in the CSA department.

Host Faculty: Y.N. Srikant

Video Archive Go Top

 

Series: M.Sc. (Engg) Colloquium
Title: HAShCache: Heterogeneity Aware Shared DRAMCache for Integrated Heterogeneous Systems

  • Speaker: Adarsh Patil
  • Faculty Advisor: R. Govindarajan
  • Date and Time: Wednesday, May 31, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Integrated Heterogeneous System (IHS) processors pack throughput oriented GPGPUs alongside latency oriented CPUs on the same die sharing certain resources, e.g., shared last level cache, network-on-chip (NoC), and the main memory. The demands for memory accesses and other shared resources from GPU cores can exceed that of CPU cores by 2 to 3 orders of magnitude. This disparity poses significant problems in exploiting the full potential of these architectures.

In this work, we propose adding a large capacity stacked DRAM, used as a shared last level cache, for the IHS processors. However, adding the DRAMCache naively, leaves significant performance on the table due to the disparate demands from CPU and GPU cores for DRAMCache and memory accesses. In particular, the imbalance can significantly reduce the performance benefits that the CPU cores would have otherwise enjoyed with the introduction of the DRAMCache, necessitating a heterogeneity aware management of this shared resource for improved performance. Further, in this work we propose three simple techniques to enhance the performance of CPU application while ensuring very little to no performance impact to the GPU. Specifically, we propose (i) PrIS, a prioritization scheme for scheduling CPU requests at the DRAMCache controller (ii) ByE, a selective and temporal bypassing scheme for CPU requests at the DRAMCache (iii) Chaining, an occupancy controlling mechanism for GPU lines in the DRAMCache through pseudo-associativity. The resulting cache, HAShCache, is heterogeneity-aware and can adapt dynamically to address the inherent disparity of demands in an IHS architecture.

We enhance the gem5-gpu simulator to model an IHS architecture with a stacked DRAM as cache, complete with coherence and unified physical memory. Using this setup we perform detailed experimental evaluation of the proposed HAShCache which results in an average system performance improvement of 41% over a naive DRAMCache and over 200% improvement over a baseline system with no stacked DRAMCache.

Speaker Bio:
Adarsh Patil is an M.Sc[Engg] student of CSA. He has obtained his B.Tech from M.S.Ramaiah Institute of Technology and has 2 years industry experience at Goldman Sachs in the Virtualization and Linux engineering team.

Video Archive Go Top

 

Series: CSA Distinguished Lecture
Title: Old and New Algorithms for Guarding Polygons

  • Speaker: Prof. Subir Kumar Ghosh
                   Professor
                   Ramakrishna Mission Vivekananda University
  • Date and Time: Thursday, May 25, 2017, 10:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The art gallery problem is to determine the number of guards (or cameras) that are sufficient to cover or see every point in the interior of an art gallery. An art gallery can be viewed as a polygon P (with or without holes) with a total of n vertices, and guards as points in P. Any point z in P is said to be visible from a guard g if the line segment joining z and g does not intersect the exterior of P. This problem was first posed by Victor Klee in a conference in 1973, and in the course of time, it has become one of the important problems in computational geometry with extensive applications to surveillance of buildings like airport terminals, railway stations etc.

A polygon is said to be guarded by a set of chosen guards if every interior point of P is visible from some guard within the guard set. In this lecture, we will first present an O(log n)-approximation algorithm to find the minimum number of guards necessary for guarding P, which was designed by Ghosh way back in 1986. For this problem, the approximation ratio was improved to O(loglog OPT) by King and Kirkpatrickin 2011. Ghosh had also conjectured in 1986 that a constant-factor approximation algorithm exists for this problem in case of polygons without holes. This conjecture was settled recently for a special class of P (without holes) by Bhattacharya, Ghosh and Roy, when they presented a 6-approximation algorithm for this problem. An outline of this algorithm will be provided in the lecture.

Chromatic art gallery problems arise from applications in areas like landmark-based navigation and wireless networks. One such problem is the weak-conflict free guarding of polygons, where the objective is to colour a guard set for P using minimum number of colours such that each point within P sees at least one guard with an unique colour. So finally, we will present an algorithm for weak conflict-free guarding of P (without holes)where the guard set is coloured using only O(log n) colours.

Speaker Bio:
Subir Kumar Ghosh was a professor of computer science and discrete applied mathematics at Tata Institute of Fundamental Research, Mumbai till July, 2015. Since then, he is a professor at RKM Vivekananda University, Belur, West Bengal. He is a Fellow of Indian Academy of Sciences since 1998. Recently, he has been awarded a professorship by the National Board for Higher Mathematics for his lifetime contributions to Mathematics. He worked as visiting scientists at many reputed universities and research institutes around the world, and gave several invited lectures in the national and international conferences, workshops and in the institutes in India and abroad. Currently, he is the Chair of the Steering Committee of an international conference on algorithms and discrete applied mathematics (CALDAM). He is the author of a book and around sixty six papers in the fields of computational geometry, robot path planning, geometric graph theory and algorithms (sequential, parallel, on-line and approximation).

Host Faculty: Prof. Sathish Govindarajan

Video Archive Go Top

 

Series: M.Sc. (Engg) Colloquium
Title: COMPILATION OF GRAPH ALGORITHMS FOR HYBRID, CROSS-PLATFORM AND DISTRIBUTED ARCHITECTURES

  • Speaker: Parita Patel
  • Faculty Advisor: Y.N. Srikant
  • Date and Time: Tuesday, May 23, 2017, 12:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Graph algorithms are abundantly used in various disciplines. These algorithms perform poorly due to random memory access and negligible spatial locality. In order to improve performance, parallelism exhibited by these algorithms can be exploited by leveraging modern high performance parallel computing resources. Implementing graph algorithms for these parallel architectures requires manual thread management and memory management which becomes tedious for a programmer.

Large scale graphs cannot fit into the memory of a single machine. One solution is to partition the graph either on multiple devices of a single node or on multiple nodes of a distributed network. All the available frameworks for such architectures demand unconventional programming which is difficult and error prone.

To address these challenges, we propose a framework for compilation of graph algorithms written in an intuitive graph domain-specific language, Falcon. The framework targets shared memory parallel architectures, computational accelerators and distributed architectures (CPU and GPU cluster). First, it analyses the abstract syntax tree (generated by Falcon) and gathers essential information. Subsequently, it generates optimized code in OpenCL for shared-memory parallel architectures and computational accelerators, and OpenCL coupled with MPI code for distributed architectures. Motivation behind generating OpenCL code is its platform-agnostic and vendor-agnostic behavior, i.e., it is portable to all kinds of devices. Our framework makes memory management, thread management, message passing, etc., transparent to the user. To the best of our knowledge, none of the available domain-specific languages, frameworks or parallel libraries handle portable implementations of graph algorithms.

Experimental evaluations demonstrate that the generated code performs comparably to the state-of-the-art non-portable implementations and hand-tuned implementations. The results also show portability and scalability of our framework.

Speaker Bio:
Parita Patel is an M.Sc(Engg.) student in the CSA department.

Host Faculty: Y.N. Srikant

Video Archive Go Top

 

Series: Department Seminar
Title: Deep Learning made easy with DARVIZ

  • Speaker: Shreya Khare and Anush Sankaran
  • Date and Time: Thursday, May 18, 2017, 3:30 PM
  • Venue: CSA Lecture Hall (Room No. 117, Ground Floor)

Abstract
Struggling with implementing research papers in deep learning? Confused between which deep learning framework to use - Tensorflow, Theano, CAFFE, etc? Wasting hours of precious time in debugging code for building the right deep learning model? You will not longer face these challenges!

DARVIZ. Its a visual programming IDE, where you could design a deep learning model using an intuitive drag-and-drop framework. Once designed, DARVIZ could write the execution ready deep learning code for you in three platforms - Tensorflow, CAFFE, and Theano. This tool helps in extreme rapid prototyping of deep learning models and implementation of state of art papers.

Speaker Bio:
Shreya Khare is software engineer working in the area of deep learning and automation at IBM Research in Bangalore, India. She is currently working on DARVIZ, a deep learning IDE for implementing, visualizing and validating deep learning models. Her research interests include Machine Learning, Speech Processing, Computer Vision and Natural Language Processing. She has completed her MS by Research from Department of Computer Science, IIT Madras. Anush Sankaran is a Researcher in the Cognitive Solutions and Services team. His research interests include deep learning, image processing, human cognition, and their applications. He is now primarily leading efforts in two projects: DARVIZ and Machine Learning for Creativity. He is currently pursuing the Ph.D. degree with the Indraprastha Institute of Information Technology, New Delhi, India. He was a recipient of the TCS Ph.D. Research Fellowship from 2010 to 2015. He has written many peer-reviewed conferences and journals and also has the Best Poster Awards in the IEEE BTAS 2013 and the IEEE IJCB 2014. Anush received the B.Tech. degree (Gold Medal) in computer science from the Coimbatore Institute of Technology, Coimbatore, India, in 2010.

Host Faculty: Aditya Kanade

Video Archive Go Top

 

Series: Department Seminar
Title: Software Transactional Memory Systems for Processing Large-Scale Graph Analytics

  • Speaker: Sathya Peri
  • Date and Time: Wednesday, May 17, 2017, 3:30 PM
  • Venue: CSA Multimedia Class (Room No. 252, First Floor)

Abstract
Nowadays, multi-core systems are quite ubiquitous from cell phones to high performance computation servers. These systems were introduced around 2004, 50 years of exponential improvement in the performance of sequential computers ended. In order to get a continued speedup on these processors, applications need to be able to harness the parallelism of the underlying hardware. This is commonly achieved using multi-threading. Yet writing correct and scalable multi-threaded programs is far from trivial. In multi-threaded programs sets of semantically related actions may need to execute in mutual exclusion to avoid semantic inconsistencies.

Traditionally, multi-threaded programs were developed in conjunction with locks to address these issues. But programming with locks has many disadvantages such as deadlocks, priority inversion etc. and makes it difficult to build scalable software systems. Importantly, lock based software components are difficult to compose i.e. build larger software systems using simpler software components. In recent years, Software Transactional Memory (STM) has garnered significant interest as an elegant alternative for developing concurrent code. Software transactions address many of the shortcomings of lock based systems. Multi-threaded programs used with STMs address many of these challenges.

In the past few years, analytics over large data, (also commonly referred to as Big-Data Analytics) has become an very important in various industries. Many of the problems of large data and data-mining can be cast as problems of large-scale graph analytics problems. This would typically involve looking for operations on large graphs having millions of vertices and edges.

In this talk, I will give an overview of Software Transactional Memory (STMs). I will describe the correctness requirements of STM systems, describe its current state. Then, I will briefly describe how STMs can benefit large-scale graph analytics. I will specifically describe about how STMs can benefit the problem of graph coloring having millions of vertices.

Speaker Bio:
Dr. Sathya Peri is an Associate Professor at IIT Hyderabad. He received his masters in Computer Science from Madurai Kamaraj University, India in 2001. After working as a software engineer at HCL Technologies, India for around one year he joined University of Texas at Dallas as a graduate student. He obtained Masters degree and Ph.D. degrees in Computer Science from the University of Texas at Dallas in 2004 and 2007 respectively. He worked under the guidance of Prof. Neeraj Mittal in the area of monitoring distributed systems. Then he worked at INRIA, Rennes, France as a postdoc from Sept 2007 to Dec 2008 in the area of Gossip Protocols for Peer to Peer systems. From Jan 2009 to Apr 2010, he was a Postdoc at Memorial University working with Prof. Vidyasankar in the area of Software Transactional Memory.

Host Faculty: Deepak D'Souza

Video Archive Go Top

 

Series: M.Sc. (Engg) Colloquium
Title: An Exploratory Framework for Cyclone Identification and Tracking

  • Speaker: Akash Anil Valsangkar
  • Faculty Advisor: Prof. Vijay Natarajan
  • Date and Time: Wednesday, May 17, 2017, 11:00 AM
  • Venue: CSA Multimedia Class (Room No. 252, First Floor)

Abstract
Analyzing depressions plays an important role in meteorology, particularly in the study of cyclones. In particular, the study of the temporal evolution of cyclones requires a robust depression tracking framework. Depressions and cyclones are not well-defined objects and their shape and size characteristics change over time. Further, current extraction and tracking methods depend on many parameters and are often specialized for particular regions. In visualization literature, most of the tracking algorithms use spatial analysis for feature identification and noise removal. With state of the art methods, temporal noise leads to cluttered visualization.

We propose a pipeline for identification and tracking of cyclones over time. Our method requires only a few intuitive parameters. This method for cyclone identification within each time step is based on well-established topological concepts enabling a robust tracking. Candidate tracks are computed from an optical flow field. These tracks are clustered within a moving time window and forwarded to a final tracking step. An exploratory framework helps in the study of cyclone movement and identifying smooth, representative tracks. Multiple case studies demonstrate the effectiveness of the method in tracking cyclones, both in the northern and southern hemisphere.

Video Archive Go Top

 

Series: M.Sc. (Engg) Thesis Defense
Title: Utilizing Worker Groups and Task Dependencies in Crowdsourcing

  • Speaker: Mr. Prakhar Ojha
  • Faculty Advisor: Dr. Partha Pratim Talukdar
  • Date and Time: Monday, May 08, 2017, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Crowdsourcing has evolved from solving simpler tasks, like image-classification, to more complex tasks such as document editing, language translation, product designing etc. Unlike micro-tasks performed by a single worker, these complex tasks require a team of workers and group output is often the atomic unit of evaluation. If the task-requester is interested in making individual payments based on their respective efforts in the group, she will need a strategy to discriminate between workers (who contribute positively) from idlers (who do not contribute to group task). It is a non-trivial problem, especially since only group output is evaluated. In the first part of this talk, I shall demonstrate how ideas from Group Testing may be effectively used for this challenging problem. This is the first application of Group Testing to crowdsourcing. I shall also list out open problems in this area.

In the second part of the talk, I shall focus on Relational Crowdsourcing (RelCrowd), a novel crowdsourcing paradigm where human intelligence tasks (HITs) are created by taking task inter-dependencies. I shall describe how this framework may be used in the evaluation of large knowledge graphs (KG), we call the resulting system KGeval. Estimating accuracy of automatically constructed KGs is a challenging problem due to their size and diversity. We shall show that the objective optimized by KGEval is submodular and NP-hard, thereby allowing for approximation algorithms with guarantees. Through experiments on real-world datasets, our results demonstrate that KGEval is able to estimate KG accuracy more accurately compared to other competitive baselines, while requiring significantly lesser number of human evaluations.

Video Archive Go Top

 

Series: CSA Distinguished Lecture
Title: The Magnificent Seven

  • Speaker: Prof. Vivek Shripad Borkar
                   Institute Chair Professor
                   Department of Electrical Engineering
                   Indian Institute of Technology, Bombay
  • Date and Time: Thursday, May 04, 2017, 2:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
This talk will give an overview of restless bandits and then will introduce individually, seven restless bandits. The summary in brief is:

Bandits are a pest

if they don't get much rest,

so do relax a little

your constraints a la Whittle.

Then all that it takes

is a simple index.

Speaker Bio:
Prof. Vivek Borkar has made stellar contributions in the areas of stochastic control and optimization, reinforcement learning, random processes, and several other areas in control and optimization. He is a J.C. Bose National Fellow and a Fellow of IEEE, INSA, IASc, INAE, NASI, and TWAS. He is a recipient of several best paper awards and is widely known for many path-breaking fundamental results in the above areas.

Host Faculty: Prof. Y Narahari

Video Archive Go Top

 

 

 

 

 

 

 

 

 

 

Copyright: CSA, IISc 2017      Phone: +91-80-22932368          Fax: +91-80-23602911 Travel Blog    Feedback    Credits