My research mostly lies at the intersection of Data Management and Distributed Systems. The focus of my current research is on managing data in modern infrastructures like cloud and blockchain. I'm also interested in Business Process Management.
Cloud environments, and distributed data management systems in general, use State Machine Replication to provide fault tolerance and to enhance performance. Fault-tolerant protocols are extensively used in the distributed database infrastructure of large enterprises such as Google, Amazon, and Facebook. However, despite years of intensive research, existing fault-tolerant protocols do not adequately address hybrid cloud environments, which consist of private and public clouds and are widely used by enterprises. To address this issue, we have developed SeeMoRe, a hybrid fault-tolerant protocol that uses the knowledge of where crash and malicious failures may occur in a public/private cloud environment to improve overall performance. In SeeMoRe, we consider a private cloud consisting of trusted replicas, a subset of which may fail-stop, and a public cloud where a subset of the replicas may behave maliciously. SeeMoRe takes explicit advantage of this knowledge to improve performance by reducing the number of communication phases and messages exchanged and/or the number of required replicas.
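The potential saving in replicas can be illustrated with a small calculation. The sketch below compares textbook replica bounds for c crash failures and m malicious failures. The hybrid bound 3m + 2c + 1 is the one generally quoted for mixed crash/Byzantine failure models and is an assumption here, not a figure stated in the text above; the function name is hypothetical.

```python
def replicas_needed(crash: int, malicious: int) -> dict:
    """Replica counts under three failure models (assumed textbook bounds)."""
    total_faults = crash + malicious
    return {
        # Pessimistic: treat every failure as malicious (3f + 1).
        "all_byzantine": 3 * total_faults + 1,
        # Hybrid: distinguish crash from malicious failures (3m + 2c + 1).
        "hybrid": 3 * malicious + 2 * crash + 1,
        # Optimistic: treat every failure as a crash (2f + 1).
        "all_crash": 2 * total_faults + 1,
    }


if __name__ == "__main__":
    # Example: 2 possible crash failures and 1 possible malicious failure.
    for model, count in replicas_needed(crash=2, malicious=1).items():
        print(f"{model}: {count} replicas")
```

Under these assumed bounds, knowing which failures can only be crashes lowers the replica count relative to treating every failure as malicious.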
Fault-tolerant protocols are also widely used in blockchain systems to establish consensus on the order of blocks. A blockchain is a distributed data structure for recording transactions that is maintained by several nodes without a central authority. In a blockchain, nodes agree on their shared states across a large network of untrusted participants. Blockchain was originally devised for the Bitcoin cryptocurrency; however, recent systems focus on its unique features, such as transparency, provenance, fault tolerance, and authenticity, to support a wide range of distributed applications. In this line of research, we have focused on permissioned blockchains and the development of techniques that will make them practical in real-life settings. These techniques address challenges regarding the performance, scalability, and confidentiality of permissioned blockchains.
Distributed applications require high performance in terms of throughput and latency; e.g., financial applications need to process tens of thousands of requests every second with very low latency. Existing permissioned blockchains mostly employ an order-execute paradigm, where nodes agree on a total order of the blocks of transactions using a consensus protocol and then every node executes the transactions sequentially in that order. Such a paradigm suffers from performance issues because of the sequential execution of transactions on all nodes. While recent permissioned blockchain systems, e.g., Hyperledger Fabric, have tried to overcome this limitation, their focus has mainly been on workloads with no contention, i.e., no conflicting transactions. We have designed ParBlockchain to support workloads with different degrees of contention. ParBlockchain uses a dependency-graph-based concurrency control technique to detect possible conflicts between transactions and to ensure the valid execution of transactions while still allowing non-conflicting transactions to be executed in parallel.
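The idea of dependency-graph-based concurrency control can be sketched as follows. This is a minimal illustration, not ParBlockchain's actual implementation: it assumes each transaction declares its read and write sets up front, and the names `Tx`, `conflicts`, `build_dependency_graph`, and `parallel_waves` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    tx_id: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def conflicts(a: Tx, b: Tx) -> bool:
    """Two transactions conflict if one writes a key the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def build_dependency_graph(block: list) -> dict:
    """Map each transaction index to the earlier indices it must wait for."""
    deps = {j: set() for j in range(len(block))}
    for j, later in enumerate(block):
        for i in range(j):
            if conflicts(block[i], later):
                deps[j].add(i)
    return deps


def parallel_waves(block: list) -> list:
    """Group transactions into waves; each wave can be executed in parallel."""
    deps = build_dependency_graph(block)
    level = {}
    for j in range(len(block)):
        # A transaction runs one wave after the latest transaction it depends on.
        level[j] = 1 + max((level[i] for i in deps[j]), default=-1)
    waves = [[] for _ in range(max(level.values(), default=-1) + 1)]
    for j, lv in level.items():
        waves[lv].append(block[j].tx_id)
    return waves


if __name__ == "__main__":
    block = [
        Tx("t1", writes={"a"}),
        Tx("t2", reads={"b"}, writes={"c"}),
        Tx("t3", reads={"a"}, writes={"b"}),
        Tx("t4", reads={"c"}),
    ]
    print(parallel_waves(block))
```

Conflicting transactions keep their agreed order because an edge always points from the earlier transaction to the later one, while transactions with no path between them are free to run concurrently.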
Besides performance, scalability is one of the main obstacles to the business adoption of blockchain systems. Scalability is the ability of a blockchain system to process a growing number of transactions by adding resources to the system. Despite recent intensive research on using sharding techniques to enhance the scalability of blockchain systems, existing solutions do not efficiently address cross-shard transactions. To address this issue, we have developed SharPer, a permissioned blockchain system that enhances scalability by clustering (partitioning) the nodes and assigning different data shards to different clusters. In SharPer, the blockchain ledger is formed as a directed acyclic graph, and each cluster maintains only a view of the ledger. SharPer also incorporates a flattened protocol to establish consensus among clusters on the order of cross-shard transactions. With this protocol, intra-shard transactions of different shards, as well as cross-shard transactions with non-overlapping shards, can be processed in parallel.
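The parallelism available among transactions that touch non-overlapping shards can be illustrated with a small scheduling sketch. It does not model SharPer's consensus protocol or its ledger; it only shows, under the assumption that each transaction lists the shards it accesses, which transactions are independent. The function name `schedule_rounds` is hypothetical.

```python
def schedule_rounds(transactions):
    """Assign transactions to rounds so that transactions in the same round
    touch disjoint shards, and transactions sharing a shard keep their order.

    transactions: list of (tx_id, set_of_shard_ids) in arrival order.
    """
    last_round_of_shard = {}  # shard id -> latest round that touched it
    rounds = []
    for tx_id, shards in transactions:
        # Run one round after the latest earlier transaction on any of these shards.
        r = 1 + max((last_round_of_shard.get(s, -1) for s in shards), default=-1)
        if r == len(rounds):
            rounds.append([])
        rounds[r].append(tx_id)
        for s in shards:
            last_round_of_shard[s] = r
    return rounds


if __name__ == "__main__":
    txs = [
        ("t1", {1}),     # intra-shard, shard 1
        ("t2", {2}),     # intra-shard, shard 2
        ("t3", {1, 2}),  # cross-shard, overlaps t1 and t2
        ("t4", {3, 4}),  # cross-shard, overlaps nothing so far
        ("t5", {2, 3}),  # cross-shard, overlaps t3 and t4
    ]
    print(schedule_rounds(txs))
```

In this example, t4 joins the first round even though it is a cross-shard transaction, because its shards are disjoint from those of t1 and t2.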
In addition to performance and scalability, confidentiality of data is required in many permissioned blockchains. Distributed applications collaborate with each other following Service Level Agreements (SLAs) to provide different services. A blockchain system needs to support both the internal and the cross-application transactions of collaborating distributed applications. While collaboration between applications, e.g., cross-application transactions, should be visible to all applications, the internal data of each application, e.g., internal transactions, might be confidential. To address this issue, we have developed Caper, a permissioned blockchain system that supports both internal and cross-application transactions of collaborating distributed applications. Caper introduces three consensus protocols to globally order cross-application transactions between applications with different internal consensus protocols.
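The visibility rule described above can be sketched as a simple filter over an ordered ledger. This is only an illustration of the confidentiality requirement, not of Caper's ledger structure or consensus protocols; the `Tx` type, the `owner` field, and `view_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tx:
    tx_id: str
    # Owning application for an internal transaction;
    # None marks a cross-application transaction.
    owner: Optional[str]


def view_for(app: str, ledger: list) -> list:
    """Return the transaction ids visible to `app`: its own internal
    transactions plus every cross-application transaction, in ledger order."""
    return [tx.tx_id for tx in ledger if tx.owner is None or tx.owner == app]


if __name__ == "__main__":
    ledger = [
        Tx("a1", "A"),
        Tx("x1", None),
        Tx("b1", "B"),
        Tx("x2", None),
        Tx("a2", "A"),
    ]
    print("A sees:", view_for("A", ledger))
    print("B sees:", view_for("B", ledger))
```

Each application's view agrees with every other view on the cross-application transactions, while internal transactions of other applications never appear in it.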
Business Process Management.
Business processes (workflows) are typically compositions of services (activities and tasks) and play a key role in every enterprise. To address data management in business processes, we have studied two main problems: workflow verification and workflow similarity. Business processes need to change in order to react quickly and adequately to internal and external events. Moreover, each business process is required to satisfy certain desirable properties such as soundness, consistency, or user-defined linear temporal logic (LTL) constraints. We have proposed a technique to incrementally check and verify the constraints of evolving business processes. Furthermore, we have developed VIEW, a framework to model, change, and VerIfy Evolving Workflows. Besides verification, finding similar processes in process repositories helps enterprises reduce their costs and increase their performance. To this end, we have proposed an approach to measure the similarity of business processes based on the similarity of their data objects. Service identification, process-driven test case generation, and validation of service choreography are some of the other problems that we have studied in the BPM domain.
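As a rough illustration of comparing processes through their data objects, the sketch below scores two processes by the overlap of the data objects they use. The Jaccard index is a stand-in chosen for simplicity, not the similarity measure proposed in the work described above, and the function name and example data are hypothetical.

```python
def data_object_similarity(process_a, process_b) -> float:
    """Jaccard index over the data objects used by two processes (0.0 to 1.0)."""
    a, b = set(process_a), set(process_b)
    if not a and not b:
        # Two processes with no data objects are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    order_to_cash = {"order", "invoice", "customer"}
    order_fulfilment = {"order", "invoice", "shipment"}
    print(data_object_similarity(order_to_cash, order_fulfilment))
```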