Algorithms & Theory
Algorithms and theory research at CCIS seeks to answer one of the most basic question faced in computer science: Which problems can computers solve efficiently? CCIS aims to find out with rigorous, stateoftheart research that tackles fundamental algorithm problems with solutions that improve Internet security, data privacy, and network efficiency.
Areas of investigation:
 Approximation algorithms
 Computational complexity
 Cryptography
 Distributed computing
 Privacy
 Learning theory
 Network algorithms
The CCIS algorithms and theory group is well known for research in cryptography, privacy network optimization, and pseudorandomness. Our faculty works in both the core foundations of computing and in computing applications—making for a broad research environment. Students and faculty collaborate closely, forming an especially cohesive community that offers internal seminars, reading groups, recruiting meetings, and social events.
Notable achievements:
 CCIS has been awarded a National Science Foundation grant to design algorithms to more efficiently and reliably identify bugs in software and hardware systems such as cell phones, medical devices, and aviation systems.
 Our researchers constructed robust, efficient cryptographic protocols and pseudorandom generators.
 CCIS researchers developed widely used ranking and search algorithms for online documents.
 Our team created protocols that enable individual network nodes to cooperate and optimize for global objectives in adversarial environments.
 CCIS researchers developed an algorithm for a model explaining the emergence of scalefree networks in natural, technological, and social systems.
Research Labs and Groups
Related News
People
Administrative Staff Tenured and Tenure Track Faculty
Agnes H. Chan
Professor
Executive Director  Cybersecurity Programs
Tenured and Tenure Track Faculty
Ehsan Elhamifar
Assistant Professor
Tenured and Tenure Track Faculty
Larry Finkelstein
Professor Emeritus
Dean Emeritus
Tenured and Tenure Track Faculty
Huy Lê Nguyen
Assistant Professor
Administrative Staff Tenured and Tenure Track Faculty
Rajmohan Rajaraman
Professor
Associate Dean & Director  Graduate School
Tenured and Tenure Track Faculty
Jonathan Ullman
Assistant Professor
Tenured and Tenure Track Faculty
Emanuele Viola
Associate Professor
Tenured and Tenure Track Faculty
Daniel Wichs
Assistant Professor
Professors of the Practice
Misha Pavel
Professor of the Practice
Interdisciplinary with Bouvé College of Health Sciences
Research Faculty and Staff
Harriet Fell
Research Professor
Research Faculty and Staff
Carlos Gershenson
Visiting Research Professor
Part Time Faculty Teaching Faculty
Ahmed Abdeen Hamed
PartTime Lecturer
Post Docs
Siyao Guo
Postdoctoral Research Fellow
Post Docs
Mor Weiss
Postdoctoral Research Associate
PhD Students
Hridam Basu
PhD Student
PhD Students
Albert Cheu
PhD Student
PhD Students
Matthew Dippel
PhD Student
PhD Students
Dan Feng
PhD Student
PhD Students
Hamidreza Jahanjou
PhD Student
PhD Students
Mahdi Khodayar
PhD Student
PhD Students
Lucianna Kiffer
PhD Student
PhD Students
Chin Ho Lee
PhD Student
PhD Students
Mehraneh Liaee
PhD Student
PhD Students
Biswaroop Maiti
PhD Student
PhD Students
Tanay Mehta
PhD Student
PhD Students
Bochao Shen
PhD Student
PhD Students
Vikrant Singhal
PhD Student
PhD Students
Ruiyang Xu
PhD Student
PhD Students
Georgios Zirdelis
PhD Student
Saber Shokat Fadaee
PhD Student
Zahra Jafargholi
PhD Student
Alessandra Scafuro
Postdoctoral Research Associate
Recent Publications
SoK: Cryptographically Protected Database Search
Fuller, B., Varia, M., Yerukhimovich, A., Shen, E., Hamlin, A., Gadepally, V., Shay, R., Mitchell, J.D. and Cunningham, R.K., SoK: Cryptographically Protected Database Search.
Abstract
Protected database search systems cryptographically isolate the roles of reading from, writing to, and administering the database. This separation limits unnecessary administrator access and protects data in the case of system breaches. Since protected search was introduced in 2000, the area has grown rapidly; systems are offered by academia, startups, and established companies.
However, there is no best protected search system or set of techniques. Design of such systems is a balancing act between security, functionality, performance, and usability. This challenge is made more difficult by ongoing database specialization, as some users will want the functionality of SQL, NoSQL, or NewSQL databases. This database evolution will continue, and the protected search community should be able to quickly provide functionality consistent with newly invented databases.
At the same time, the community must accurately and clearly characterize the tradeoffs between different approaches. To address these challenges, we provide the following contributions:
 An identification of the important primitive operations across database paradigms. We find there are a small number of base operations that can be used and combined to support a large number of database paradigms.
 An evaluation of the current state of protected search systems in implementing these base operations. This evaluation describes the main approaches and tradeoffs for each base operation. Furthermore, it puts protected search in the context of unprotected search, identifying key gaps in functionality.
 An analysis of attacks against protected search for different base queries.
 A roadmap and tools for transforming a protected search system into a protected database, including an opensource performance evaluation platform and initial user opinions of protected search.
Binary AMD Circuits from Secure Multiparty Computation
Genkin, Daniel, Ishai, Yuval, and Weiss, Mor. "Binary AMD Circuits from Secure Multiparty Computation." Theory of Cryptography Conference. Springer Berlin Heidelberg, 2016.
Abstract
An AMD circuit over a finite field F is a randomized arithmetic circuit that offers the “best possible protection” against additive attacks. That is, the effect of every additive attack that may blindly add a (possibly different) element of F to every internal wire of the circuit can be simulated by an ideal attack that applies only to the inputs and outputs.
Genkin et al. (STOC 2014, Crypto 2015) introduced AMD circuits as a means for protecting MPC protocols against active attacks, and showed that every arithmetic circuit C over F can be transformed into an equivalent AMD circuit of size O(C) with O(1/F) simulation error. However, for the case of the binary field F=F2, their constructions relied on a tamperproof output decoder and could only realize a weaker notion of security.
We obtain the first constructions of fully secure binary AMD circuits. Given a boolean circuit Cand a statistical security parameter σ, we construct an equivalent binary AMD circuit C′ of size C⋅polylog(C,σ) (ignoring lower order additive terms) with 2−σ simulation error. That is, the effect of toggling an arbitrary subset of wires can be simulated by toggling only input and output wires.
Our construction combines in a general way two types of “simple” honestmajority MPC protocols: protocols that only offer security against passive adversaries, and protocols that only offer correctness against active adversaries. As a corollary, we get a conceptually new technique for constructing activesecure twoparty protocols in the OThybrid model, and reduce the open question of obtaining such protocols with constant computational overhead to a similar question in these simpler MPC models.
Dissimilaritybased Sparse Subset Selection
Dissimilaritybased Sparse Subset Selection, E. Elhamifar, G. Sapiro and S. Sastry, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
Finding an informative subset of a large number of data points or models is at the center of many problems in machine learning, computer vision, bio/health informatics and image/signal processing. Given pairwise dissimilarities between the elements of a `source set’ and a `target set,’ we consider the problem of finding a subset of the source set, called representatives or exemplars, that can efficiently describe the target set. We formulate the problem as a rowsparsity regularized trace minimization problem. Since the proposed formulation is, in general, an NPhard problem, we consider a convex relaxation. The solution of our proposed optimization program finds the representatives and the probability that each element of the target set is associated with the representatives. We analyze the solution of our proposed optimization as a function of the regularization parameter. We show that when the two sets jointly partition into multiple groups, the solution of our proposed optimization program finds representatives from all groups and reveals clustering of the sets. In addition, we show that our proposed formulation can effectively deal with outliers. Our algorithm works with arbitrary dissimilarities, which can be asymmetric or violate the triangle inequality. To efficiently implement our proposed algorithm, we consider an Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. We show that the ADMM implementation allows to parallelize the algorithm, hence further reducing the computational cost. Finally, by experiments on realworld datasets, we show that our proposed algorithm improves the state of the art on the two problems of scene categorization using representative images and timeseries modeling and segmentation using representative models.
Making the Best of a Leaky Situation: ZeroKnowledge PCPs from LeakageResilient Circuits
Ishai, Yuval, Weiss, Mor, and Yang, Guang. "Making the Best of a Leaky Situation: ZeroKnowledge PCPs from LeakageResilient Circuits." Theory of Cryptography Conference. Springer Berlin Heidelberg, 2015.
Abstract
A Probabilistically Checkable Proof (PCP) allows a randomized verifier, with oracle access to a purported proof, to probabilistically verify an input statement of the form “x∈L” by querying only few bits of the proof. A zeroknowledge PCP (ZKPCP) is a PCP with the additional guarantee that the view of any verifier querying a bounded number of proof bits can be efficiently simulated given the input x alone, where the simulated and actual views are statistically close.
Originating from the first ZKPCP construction of Kilian et al. [21], all previous constructions relied on locking schemes, an unconditionally secure oraclebased commitment primitive. The use of locking schemes makes the verifier inherently adaptive, namely, it needs to make at least two rounds of queries to the proof.
Motivated by the goal of constructing nonadaptively verifiable ZKPCPs, we suggest a new technique for compiling standard PCPs into ZKPCPs. Our approach is based on leakageresilient circuits, which are circuits that withstand certain “sidechannel” attacks, in the sense that these attacks reveal nothing about the (properly encoded) input, other than the output. We observe that the verifier’s oracle queries constitute a sidechannel attack on the wirevalues of the circuit verifying membership in L, so a PCP constructed from a circuit resilient against such attacks would be ZK. However, a leakageresilient circuit evaluates the desired function only if its input is properly encoded, i.e., has a specific structure, whereas by generating a “proof” from the wirevalues of the circuit on an illformed “encoded” input, one can cause the verification to accept inputs x∉Lwith probability 1. We overcome this obstacle by constructing leakageresilient circuits with the additional guarantee that illformed encoded inputs are detected. Using this approach, we obtain the following results:

We construct the first witnessindistinguishable PCPs (WIPCP) for NP with nonadaptive verification. WIPCPs relax ZKPCPs by only requiring that different witnesses be indistinguishable. Our construction combines strong leakageresilient circuits as above with the PCP of Arora and Safra [2], in which queries correspond to sidechannel attacks by shallow circuits, and with correlation bounds for shallow circuits due to Lovett and Srivinasan [22].

Building on these WIPCPs, we construct nonadaptively verifiable computational ZKPCPs for NP in the common random string model, assuming that oneway functions exist.

As an application of the above results, we construct 3round WI and ZK proofs for NP in a distributed setting in which the prover and the verifier interact with multiple servers of which t can be corrupted, and the total communication involving the verifier consists of polylog(t)bits.
Practical Solutions For FormatPreserving Encryption
Weiss, Mor, Rozenberg, Boris, and Barham, Muhammad. "Practical Solutions For FormatPreserving Encryption." arXiv preprint arXiv:1506.04113 (2015).
Abstract
Format Preserving Encryption (FPE) schemes encrypt a plaintext into a ciphertext while preserving its format (e.g., a valid socialsecurity number is encrypted into a valid socialsecurity number), thus allowing encrypted data to be stored and used in the same manner as unencrypted data. Motivated by the alwaysincreasing use of cloudcomputing and memory delegation, which require preserving both plaintext format and privacy, several FPE schemes for general formats have been previously suggested. However, current solutions are both insecure and inefficient in practice. We propose an efficient FPE scheme with optimal security. Our scheme includes an efficient method of representing general (complex) formats, and provides efficient encryption and decryption algorithms that do not require an expensive setup. During encryption, only formatspecific properties are preserved, while all messagespecific properties remain hidden, thus guaranteeing data privacy. As experimental results show that in many cases large formats domains cannot be encrypted efficiently, we extend our scheme to support large formats, by imposing a userdefined bound on the maximal format size, thus obtaining a flexible securityefficiency tradeoff and the best possible security (under the size limitation).
Probabilistically checkable proofs of proximity with zeroknowledge
Ishai, Yuval, and Weiss, Mor. "Probabilistically checkable proofs of proximity with zeroknowledge." Theory of Cryptography Conference. Springer Berlin Heidelberg, 2014.
Abstract
A probabilistically Checkable Proof (PCP) allows a randomized verifier, with oracle access to a purported proof, to probabilistically verify an input statement of the form “x ∈ L” by querying only few bits of the proof. A PCP of proximity (PCPP) has the additional feature of allowing the verifier to query only few bits of the inputx, where if the input is accepted then the verifier is guaranteed that (with high probability) the input is close to some x′ ∈ L.
Motivated by their usefulness for sublinearcommunication cryptography, we initiate the study of a natural zeroknowledge variant of PCPP (ZKPCPP), where the view of any verifier making a bounded number of queries can be efficiently simulated by making the same number of queries to the input oracle alone. This new notion provides a useful extension of the standard notion of zeroknowledge PCPs. We obtain two types of results.

Constructions. We obtain the first constructions of queryefficient ZKPCPPs via a general transformation which combines standard queryefficient PCPPs with protocols for secure multiparty computation. As a byproduct, our construction provides a conceptually simpler alternative to a previous construction of honestverifier zeroknowledge PCPs due to Dwork et al. (Crypto ’92).

Applications. We motivate the notion of ZKPCPPs by applying it towards sublinearcommunication implementations of commitandprove functionalities. Concretely, we present the first sublinearcommunication commitandprove protocols which make a blackbox use of a collisionresistant hash function, and the first such multiparty protocols which offer informationtheoretic security in the presence of an honest majority.
Select Projects
Network Algorithms Under Adversarial and Stochastic Uncertainty
Lead PI
Funding
Network Algorithms Under Adversarial and Stochastic Uncertainty
This project studies the design of highly robust networked systems that are resilient to extreme failures and rapid dynamics, and provide optimal performance under a wide spectrum of scenarios with varying levels of predictability.
Modern information networks are composed of heterogeneous nodes and links, whose capacities and capabilities change unexpectedly due to mobility, failures, maintenance, and adversarial attacks. User demands and critical infrastructure needs, however, require that basic primitives including access to information and services be always efficient and reliable. This project studies the design of highly robust networked systems that are resilient to extreme failures and rapid dynamics, and provide optimal performance under a wide spectrum of scenarios with varying levels of predictability.
The focus of this project will be on two problem domains, which together address adversarial network dynamics and stochastic network failures. The first component is a comprehensive theory of information spreading in dynamic networks. The PI will develop an algorithmic toolkit for dynamic networks, including local gossipstyle protocols, network coding, random walks, and other diffusion processes. The second component of the project concerns failureaware network algorithms that provide high availability in the presence of unexpected and correlated failures. The PI will study failureaware placement of critical resources, and develop flow and cut algorithms under stochastic failures using techniques from chanceconstrained optimization. Algorithms tolerant to adversarial and stochastic uncertainty will play a critical role in largescale heterogeneous information networks of the future. Broader impacts include student training and curriculum development.
The Control Principles of Complex Systems
Lead PI
Co PI
Funding
The Control Principles of Complex Systems
Many of the truly difficult problems limiting advances in contemporary science are rooted in our limited understanding of how complex systems are controlled. Indeed, in human cells millions of molecules are embedded in a complex genetic network that lacks an obvious controller; in society billions of individuals interact with each other through intricate trustfamilyfriendshipprofessionalassociation based networks apparently controlled by no one; economic change is driven by what economists call the “invisible hand of the market”, reflecting a lack of understanding of the control principles that govern the interactions between individuals, companies, banks and regulatory agencies.
These and many other examples raise several fundamental questions: What are the control principles of complex systems? How do complex systems organize themselves to achieve sufficient control to ensure functionality? This proposal is motivated by the hypothesis that the architecture of many complex systems is driven by the system’s need to achieve sufficient control to maintain its basic functions. Hence uncovering the control principles of complex selforganized systems can help us understand the fundamental laws that govern them.
The Integrative Genomics of Acute Asthma Control
Lead PI
Co PI
Benjamin Alexander Raby
Frank D. Gilliland
Funding
The Integrative Genomics of Acute Asthma Control
Using the Asthma BioRepository for Integrative Genomic Exploration (Asthma BRIDGE), we will perform a series of systemslevel genomic analyses that integrate clinical, environmental and various forms of “omic” data (genetics, genomics, and epigenetics) to better understand how molecular processes interact with critical environmental factors to impair asthma control.
The overarching hypothesis of this proposal is that interindividual differences in asthma control result from the complex interplay of both environmental, genomic, and socioeconomic factors organized in discrete, scalefree molecular networks. Though strict patient compliance with asthma controller therapy and avoidance of environmental triggers are important strategies for the prevention of asthma exacerbation, failure to maintain control is the most common healthrelated cause of lost school and workdays. Therefore, better understanding of the molecular underpinnings and the role of environmental factors that lead to poor asthma control is needed. Using the Asthma BioRepository for Integrative Genomic Exploration (Asthma BRIDGE), we will perform a series of systemslevel genomic analyses that integrate clinical, environmental and various forms of “omic” data (genetics, genomics, and epigenetics) to better understand how molecular processes interact with critical environmental factors to impair asthma control. This proposal consists three Specific Aims, each consisting of three investigational phases: (i) an initial computational discovery phase to define specific molecular networks using the Asthma BRIDGE datasets, followed by two validation phases – (ii) a computational validation phase using an independent clinical cohort, and (iii) an experimental phase to validate critical molecular edges (genegene interactions) that emerge from the defined molecular network.
In Specific Aim 1, we will use the Asthma BRIDGE datasets to define interactome submodule perturbed in poor asthma control;the regulatory variants that modulate this asthmacontrol module;and to develop a predictive model of asthma control.
In Specific Aim 2, we will study the effects exposure to air pollution and environmental tobacco smoke on modulating the asthma control networks, testing for environmentdependent alterations in network dynamics.
In Specific Aim 3, we will study the impact of inhaled corticosteroids (ICS – the most efficacious asthmacontroller medication) on network dynamics of the asthmacontrol submodule by comparing network topologies of acute asthma control between subjects taking ICS to those not on ICS. For our experimental validations, we will assess relevant genegene interactions by shRNA studies bronchial epithelial and Jurkat T cell lines. Experimental validations of findings from Aim 2 will be performed by cotreating cells with either cigarette smoke extract (CSE) or ozone. Similar studies will be performed with cotreatment using dexamethasone to validate findings from Aim 2. From the totality of these studies, we will gain new insights into the pathobiology of poor asthma control, and define targets for biomarker development and therapeutic targeting.
Public Health Relevance
Failure to maintain tight asthma symptom control is a major healthrelated cause of lost school and workdays. This project aims to use novel statistical networkmodeling approaches to model the molecular basis of poor asthma control in a wellcharacterized cohort of asthmatic patients with available genetic, gene expression, and DNA methylation data. Using this data, we will define an asthmacontrol gene network, and the genetic, epigenetic, and environmental factors that determine interindividual differences in asthma control.