SRM seminar

Abstracts

Speaker: Yang Zhang
Title: Language in Our Time: An Empirical Analysis of Hashtags
Abstract: Hashtags in online social networks have gained tremendous popularity during the past five years. The resulting large quantity of data has provided a new lens into modern society. Previously, researchers mainly rely on data collected from Twitter to study either a certain type of hashtags or a certain property of hashtags. In this paper, we perform the first large-scale empirical analysis of hashtags shared on Instagram, the major platform for hashtag-sharing. We study hashtags from three different dimensions including the temporal-spatial dimension, the semantic dimension, and the social dimension. Extensive experiments performed on three large-scale datasets with more than 7 million hashtags in total provide a series of interesting observations. First, we show that the temporal patterns of hashtags can be categorized into four different clusters, and people tend to share fewer hashtags at certain places and more hashtags at others. Second, we observe that a non-negligible proportion of hashtags exhibit large semantic displacement. We demonstrate hashtags that are more uniformly shared among users, as quantified by the proposed hashtag entropy, are less prone to semantic displacement. In the end, we propose a bipartite graph embedding model to summarize users’ hashtag profiles, and rely on these profiles to perform friendship prediction. Evaluation results show that our approach achieves an effective prediction with AUC (area under the ROC curve) above 0.8 which demonstrates the strong social signals possessed in hashtags.

Speaker: Christian Majenz
Title: Can you sign a quantum state?
Abstract: Cryptography with quantum states exhibits a number of surprising and counterintuitive features. In an intriguing 2002 paper, Barnum et al. argued that these strange features imply that digital signatures for quantum states are impossible. In this work, we thoroughly explore this question from a theoretical crypto perspective. We expand on the work of Barnum et al. and show that even very weak forms of signing quantum states are impossible; essentially, if a signature scheme is secure, then it is classical. We then show a positive result: it is possible to sign quantum states, provided that they are also encrypted with the public key of the intended recipient. Following classical nomenclature, we call this notion quantum signcryption. Classically, signcryption is only interesting if it provides superior efficiency to simultaneous encryption and signing. Our results imply that, quantumly, it is far more interesting: by the laws of quantum mechanics, it is the only signing method available. We develop security definitions for quantum signcryption, ranging from a simple one-time two-user setting, to a chosen-ciphertext-secure many-time multi-user setting. We also give secure constructions based on post-quantum public-key primitives. (Joint work with Gorjan Alagic and Tommaso Gagliardoni)

Speaker: Hüseyin Demirci
Title: The Next Generation Sequencing Era: Applications, Challenges and Threats
Abstract: In this talk we will briefly discuss the Next Generation Sequencing (NGS) technology which revolutionizes life science research and enables many technological applications. After giving a short background of the technology, we will point out some practices like large scale genomics projects and use cases in medicine and biotechnology. We will demonstrate the procedure of identifying disease causing mutations among thousands of candidates and discuss the motivation of the developed NGS data analysis tools which facilitate this process. We will present the results obtained from a 1100 genome sequencing project and discuss the lessons learnt. The structure of NGS data and the analysis pipeline will be emphasized. We will provide instances of challenges arising from this structure. Later, we will briefly refer to genomic privacy issues threatened by this revolution and indicate some ethical and socio-technical problems. Finally, we will exemplify security related problems and point out the proposed alternative solutions for these problems.

Speaker: Matteo Acclavio
Title: Combinatorial proofs for modal logic
Abstract: In this talk I will present combinatorial proofs for modal logic. I will show how to encode modal formulas by means of graphs and how it is possible to check whether the formula represented by the graph is linearly provable (i.e. without weakening and contraction) in the logic K. Then, I define a family of morphisms between graphs which allows to give a sound and complete proof system for all the modal logics of the S4 plane.

Speaker: Lutz Strassburger
Title: Combinatorial proofs and their normalization
Abstract: In this talk I will give an introduction to combinatorial proofs for classical propositional logic and discuss their normalization. I will also discuss the substitution of proofs into proofs. If time permits, I will also present intuitionistic combinatorial proofs.

Speaker: Cheng-Te Li
Title: Learning Feature Representation in Heterogeneous Information Networks
Abstract: Network embedding (i.e., network feature representation) is a powerful technique for typical machine learning tasks in social and information networks, such as link prediction and node classification. Network embedding aims at learning an effective vector transformation so that nodes in a network can be projected into a low-dimensional feature space. State-of-the-art network embedding methods mainly focus on homogeneous networks, i.e., nodes and links are of the same types. We have developed novel embedding learning algorithms in heterogeneous information networks that contain different types of nodes and links. The goal is to let the learned embedding preserve not only graph proximity between nodes but also relational and attributed information associated on nodes and links. Based on heterogeneous network embedding learning, in this talk, I will present how to boost the performance of link prediction and node classification in both homogeneous and heterogeneous realms. I will also demonstrate the promising prediction power of our embedding learning models by the experiments conducted on six real-world network datasets.

Speaker: Yang Zhang
Title: Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning
Abstract: Machine learning (ML) has progressed rapidly during the past decade and the major factor that drives such development is the unprecedented large-scale data. As data generation is a continuous process, this leads to ML service providers updating their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this talk, I will cover our newest results on investigating whether the change in the output of a black-box ML model before and after being updated can leak information of the dataset used to perform the update. This constitutes a new attack surface against black-box ML models and such information leakage severely damages the intellectual property and data privacy of the ML model owner/provider. In contrast to membership inference attacks, we use an encoder-decoder formulation that allows inferring diverse information ranging from detailed characteristics to full reconstruction of the dataset. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (BM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that allows generating accurate samples. Our experiments show effective prediction of dataset characteristics and even full reconstruction in challenging conditions.

Speaker: Sergiu Bursuc
Title: Secure evoting and fair exchange with minimal trust
Abstract: Cryptographic protocols are modeled by transition rules, also called oracles, representing (symbolic or computational) protocol execution in an adversarial environment. Oracles can model fine-grained trust assumptions about the infrastructure and parties involved in the protocol, allowing adversarial influence or leakage from computations. We consider two case studies: electronic voting - with no single point of trust, and public ledger protocols - where we only trust smart contract code on the blockchain. We design oracles and security specifications for these applications, show how they can be analyzed with automated tools, and what challenges remain for protocol design and formal methods.

Speaker: Jintai Ding
Title: Post-quantum key exchange and recent progress
Abstract: In this talk, we will present first the basic post-quantum key exchange based on LWE and RLWE. Then we will present some recent progress on key reuse attacks and authenticated key exchanges.

Speaker: Yury Zhauniarovich
Title: Characterizing Bitcoin Donations to Open Source Software on GitHub
Abstract: Web-based hosting services for version control, such as GitHub, have made it easier for people to develop, share, and donate money to software repositories. In this paper, we study the use of Bitcoin to make donations to open source repositories on GitHub. In particular, we analyze the amount and volume of donation over time, in addition to its relationship to the age and popularity of a repository. We scanned over three million repositories looking for donation addresses. We then extracted and analyzed their transactions from Bitcoin's public blockchain. Overall, we found a limited adoption of Bitcoin as a payment method for receiving donations, with nearly 44 thousand deposits adding up to only 8.3 million dollars in the last 10 years. We also found weak positive correlation between the amount of donation in dollars and the popularity of a repository, with highest correlation (r=0.013) associated with number of forks.

Speaker: Markus Dürmuth
Title: Passwords are dead - What's next?
Abstract: There is growing discomfort about password-based authentication, and passwords are declared "dead" on a regular basis. However, passwords are still the most widely used tool for user authentication on the Internet. In this talk we will have a closer look on the current state of user authentication, and will discuss several variants to strengthen user authentication. One specifically interesting approach is to use additional signals such as source IP, geo-location, or browser configuration. These allow a service to estimate the risk of a malicious login and to take appropriate countermeasures. This requires minimal changes to the user experience, and is used in practice by several services.

Speaker: Chloe Martindale
Title: CSIDH: an efficient post-quantum commutative group action
Abstract: CSIDH, or 'commutative supersingular isogeny Diffie-Hellman' is a new isogeny-based protocol of Castryck, Lange, Martindale, Panny, and Renes. The Diffie-Hellman style scheme resulting from the group action allows for public key validation at very little cost, runs reasonably fast in practice, and has public keys of only 64 bytes at a conjectured AES-128 security level, matching NIST’s post-quantum security category I. For comparison, the SIDH (and SIKE) isogeny-based cryptosystems are faster than CSIDH, but they do not support non-interactive key exchange, and their public keys and ciphertexts are 6 times larger than in CSIDH. We will describe the CSIDH protocol, give an overview of the security analysis, and outline some potential applications.

Speaker: Wei Guo
Title: Second-Child Birth Intentions of China's Floating Population: Influences from Socioeconomic Factors and Home Ownership
Abstract: The adoption of the universal two-child policy in late 2015, replacing the one-child policy, signals a dramatic turnabout in China’s family planning policy. The Migrant Dynamics Monitoring Survey (MDMS) in 2016 provides a nationally representative dataset that enables us to employ a multilevel analytical framework to reveal the socioeconomic correlates of second-child birth intentions in China’s large population of migrants. We find that male, younger, and more affluent migrants, those from minority ethnic groups, individuals migrating from rural areas, those whose first child is a girl, and those living in economically less developed cities are more likely to express an intention to have a second child. Moreover, our paper establishes an empirical connection between home ownership and birth intentions. Migrants who own their own home in the destination city express lower intentions to have a second child, compared with those who are renting. We suggest a proposition about this counterintuitive coexistence of home ownership and lower birth intentions. Home ownership and childbearing compete for limited financial resources of migrants who are particularly disadvantaged in Chinese society.

Speaker: Richard Clayton
Title: Booters: who runs them and how might we make them stop?
Abstract: Booters (they call themselves "stressers") are websites selling extremely cheap DDoS attacks, mainly to the online gaming community who want to cheat at their shoot-em-ups by disrupting the opposition. The booters are also used, but far less often, for extortion, to disrupt school websites on the eve of a test, or just because someone doesn't like you. This talk presents some of our (published) research into who runs booters and why. It then discusses the FBI intervention last December when 15 booter domain names were seized and discuss what we have seen since both in terms of the number of DDoS attacks per day and how the whole activity is now being viewed by those who are involved.

Speaker: Mathias Ramparison
Title: Decision problems for parametric timed automata and application to attack-fault trees
Abstract: Several parametric extensions of timed automata (TAs) were proposed in the literature, and the vast majority of them leads to the undecidability of the EF-emptiness problem: “is the set of valuations reaching a given location empty?” First, we study an extension of TAs here clocks can be reset to a parameter. While the EF-emptiness problem is undecidable for rational-valued parameters, it becomes PSPACE-complete for integer-valued parameters. In addition, exact synthesis of the parameter valuations set can be achieved.
Furthermore, we introduce a new class of Parametric Timed Automata (PTAs) where we allow clocks to be compared to parameters in guards, as in classic PTAs, but also to be updated to parameters. Nonetheless, if we update all clocks each time we compare a clock with a parameter and each time we update a clock to a parameter, we obtain a syntactic subclass for which we can decide the EF-emptiness problem and even perform the exact synthesis of the set of rational valuations such that a given location is reachable.
Finally, we define and implement the translation of attack-fault trees (AFTs) to a new extension of timed automata, called parametric weighted timed automata. This allows us to parametrize constants such as time and discrete costs in an AFT and then, using the model-checker IMITATOR, to compute the set of parameter values such that a successful attack is possible. Using the different sets of parameter values computed, different attack and fault scenarios can be deduced depending on the budget, time or computation power of the attacker, providing helpful data to select the most efficient counter-measure.

Speaker: Tiansi Dong
Title: Imposing Category Trees Onto Word-Embeddings Using A Geometric Construction
Abstract: We present a novel method to precisely impose tree-structured category information onto word-embeddings, resulting in ball embeddings in higher dimensional spaces (N-balls for short). Inclusion relations among N-balls implicitly encode subordinate relations among categories. The similarity measurement in terms of the cosine function is enriched by category information. Using a geometric construction method instead of back-propagation, we create large N-ball embeddings that satisfy two conditions: (1) category trees are precisely imposed onto word embeddings at zero energy cost; (2) pre-trained word embeddings are well preserved. A new benchmark data set is created for validating the category of unknown words. Experiments show that N-ball embeddings, carrying category information, significantly outperform word embeddings in the test of nearest neighborhoods, and demonstrate surprisingly good performance in validating categories of unknown words.

Speaker: Isa Sertkaya
Title: Attacks and new proposals on e-checkbook schemes, and Vote privacy issue on Estonian i-Voting System
Abstract: This talk is comprised of two parts. The first part will open with the necessary definitions, e-check forgery and manipulation attacks on the e-checkbook schemes proposed by T.H. Chen et al. (2009), Chang et al. (2009) and C.L. Chen et al. (2010) will be described. Next, how to make these proposals secure will be presented. New schemes that promise mutual authentication and privacy enhancement will be given. In the second part, vote privacy leakage possibility on Estonian Internet Voting (i-Voting) system will be the focus. i-Voting system has being used since 2005. After the Estonian Parliamentary Elections held in 2011, an individual verification mechanism (VerApp) was integrated into the i-voting system because of the so-called Student Attack. The aim of VerApp is to give opportunity to the voters to verify (preferably on a different mobile device) that their vote is recorded in the central system as intended. This verification phase ends by displaying the cast vote in plain form on the verification device. After giving the details, the possibility of vote privacy leakage due to displaying the cast vote in plain form will be discussed.

Speaker: Ehsan Ebrahimi
Title: Post-Quantum Security in the Presence of Superposition Queries
Abstract: A large-scale quantum computer can break conventional public key encryption schemes. To address the challenge caused by a large-quantum computer, we need to retool the security infrastructures with post-quantum secure primitives. To achieve a post-quantum secure scheme, one needs to design a construction based on quantum-hard computational assumptions (E.g. Lattice) and then proves the security of the construction against a quantum adversary. The security proof may be a challenging task because classical security proof techniques may fail against a quantum adversary due to the strange properties of quantum mechanics. One example is the security proofs in the presence of adversarial superposition queries. In this talk, we investigate the security of many cryptographic constructions in the presence of superposition queries. This includes the post-quantum security of hash functions (indifferentiability and collision-resistance property), the post-quantum security of Fujisaki-Okamoto transformation that is a transformation from two weakly secure encryption schemes to an IND-CCA secure public-key encryption scheme in the Random Oracle Model, and the post-quantum security of modes of operations.

Speaker: Vivek Nigam
Title: Model-Based Safety and Security Engineering
Abstract: By exploiting the increasing surface attack of systems, cyber-attacks can cause catastrophic events, such as, remotely disable safety mechanisms. This means that in order to avoid hazards, safety and security need to be integrated, exchanging information, such as, key hazards/threats, risk evaluations, mechanisms used. In this talk, we describe some steps towards this integration by using models. We start by identifying some key technical challenges. Then we demonstrate how models, such as Goal Structured Notation (GSN) for safety and Attack Defense Trees (ADT) for security, can address these challenges. In particular, (1) we demonstrate how to extract in an automated fashion security relevant information from safety assessments by translating GSN-Models into ADTs; (2) We show how security results can impact the confidence of safety assessments; (3) We propose a collaborative development process where safety and security assessments are built by incrementally taking into account safety and security analysis; (4) We describe how to carry out trade-off analysis in an automated fashion, such as identifying when safety and security arguments contradict each other and how to solve such contradictions. We conclude pointing out that these are the first steps towards a wide range of techniques to support Safety and Security Engineering.

Speaker: Hyunwoo Lee
Title: maTLS: How to Make TLS middlebox-aware?
Abstract: Middleboxes are widely deployed in order to enhance security and performance in networking. As communication over TLS becomes increasingly common, however, the end-to-end channel model of TLSundermines the efficacy of middleboxes. Existing solutions, such as ‘SplitTLS’, which intercepts TLS sessions, often introduce significant security risks by installing a custom root certificate or sharing a private key. Many studies have confirmed security vulnerabilities when combining TLS with middleboxes, which include certificate validation failures, use of obsolete ciphersuites, and unwanted content modification. To address the above issues, we introduce a middlebox-aware TLS protocol (maTLS), which allows middleboxes to participate in the TLS session in a visible and auditable fashion. In this talk, we first focus on the problems of SplitTLS, especially in terms of authentication, confidentiality, and integrity. Then, we talk about how we make the middleboxes visible with introducing the notion of auditable middleboxes. Lastly, we see how the maTLS protocol resolves the issues with auditable middleboxes.

Speaker: Anton Feenstra
Title: Exhaustive simulation of epistatic patterns using Petri net models
Abstract: Non-linear efects of double gene deletions, i.e. genetic interactions (GIs) or epistasis, are crucial properties of biological systems. These non-linear interactions are characterized by the phenotype of interacting single and double mutated gene pairs. Uncovering the molecular mechanisms underlying GI patterns would provide a better understanding of their role in biological processes, diseases, and drug response. Many factors that shape the phenotype of two interacting genes, and little direct experimental data exists to derive a mechanism. Therefore, we set out to exhaustively explore all possible Petri net models containing four gene-nodes: two regulatory and two downstream. From this analysis, we show how we can make some interesting observations about the nature of molecular mechanisms that underlie the diferent genetic interaction patterns. Here, we focus on the GI pattern of ‘inversion’ as this is less well characterized. We furthermore describe the modelling and analysis framework we developed to enable to do this exhaustively.

Speaker: Mete Akgun
Title: Privacy-Preserving Processing of Genomic Data
Abstract: Diagnosis and treatment decisions on big genomic data have become widespread as the cost of genome sequencing decreases gradually. In this context, disease-gene association studies are of great importance. The genomic data is very sensitive when compared to other data types and contains information about the individual and his / her relatives. Many studies have shown that this information can be obtained from the query-response pairs on genomic databases. In this talk, we focus on privacy-preserving methods that allow multiple genomic databases to be safely stored in semi-honest cloud environments and to be queried at the same time. We also talk about methods that ensure the privacy of patients in disease-gene association studies. Lastly, we talk about privacy-preserving machine learning algorithms for analyzing genomic data.

Speaker: Jiangshan Yu
Title: RepuCoin: An extra resilient blockchain consensus system
Abstract: Existing proof-of-work based cryptocurrencies cannot tolerate attackers controlling more than 50% of computing power in the system at any time, but assume that such a condition happening is “unlikely”. However, recent facts and attacks render this assumption unrealistic. To address this issue, we propose RepuCoin, the first system providing guarantees even when more than 50% of the system’s computing power is temporarily dominated by an attacker. While providing better resilience to known attacks, RepuCoin also achieves a high throughput of 10,000 transactions per second.

Speaker: Damien Desfontaines
Title: Differential privacies: a taxonomy of semantic data privacy definitions
Abstract: Shortly after its introduction in 2006, differential privacy became the flagship of data privacy definitions. Since then, a large number of variants and extensions were proposed to adapt it to different scenarios and attack models. What are the motivations behind these variants? In which context should you use one or the other? In this talk, I'll present a way to systematize this body of research, propose a classification of semantic privacy definitions, and explain what are the challenges encountered along the way. This is ongoing research, done in collaboration with Balazs PEJO.

Speaker: Jintai Ding
Title: Quantum-Proof Blockchain
Abstract: Blockchain technology is now going through explosive development with the aim to build a new generation of revolutionary financial technology. The most successful example is new digital currency bitcoin. The fundamental building block in blockchain technology is actually cryptographic algorithms, which is why bitcoin is actually called a cryptocurrency. The main cryptographic algorithms used in blockchain technology are hash functions and elliptic curve digital signatures. As we all know, quantum computers are not such a significant threat to the security of Hash functions but it can be fatal to the elliptic curve digital signatures. In this presentation, we will first show how the quantum computers can threat the security of blockchain technology, in particular, why the existing blockchain technology used in bitcoins can not fundamentally avoid such a practical attack. Then we will explain the challenges we will face if if we just plug in existing post-quantum cryptographic solutions as a drop-in to replace the existing elliptic curve signatures, in particular, the key size problem and a few others. In the end, we will present some of the new solutions we have been developing to deal with these fundamental problems including a new type of proof of work algorithms, which, we believe, provide very viable solutions for the future long term secure blockchain technology.

Speaker: David Naccache
Title: Missing Species in the Digital Signatures Zoo?
Abstract: This talks describes new OWFs and signature schemes. The new signature schemes are interesting because 1. they are based on factoring and 2. they do not belong to the two common design blueprints which are the inversion of a trapdoor permutation and the Fiat-Shamir transform. The signature algorithms are derived from a new OWF whose inversion is as provably as difficult as factoring. By opposition to the DLP, Rabin or RSA, which assume that the target modulus is built into the OWF, the new OWF does not require any built-in parameters except the modulus' size. Given of their strangeness and very different design the new signature schemes seem to be an overlooked "missing species" in the corpus of known signature algorithms (based on arithmetics modulo p q^4, the produced signatures are primes, i.e. a sort of Australian platypus...) . We stress that despite the signature schemes' very simple description (one formula), we did not manage to prove their security nor find any attacks against them. We hence conjecture their security and invite the community to scrutinize them. Common work with Eric Brier (Ingenico) and Houda Ferradi (NTT Labs)

Speaker: Dennis Jackson
Title: Automated Analysis of Digital Signatures’ Unexpected Behaviours
Abstract: Automated protocol analysis tools, such as Tamarin and Proverif, have found a prominent role in analysing widely used security protocols such as TLS, 5G and Signal. However, these automated tools work in the "Symbolic Model", which abstracts and approximates the behaviour of cryptographic primitives. In this talk, we consider the symbolic approximation of digital signatures, which has remained unchanged since 2001. How well does it capture the properties of real signature schemes? We uncover a surprising mismatch between these symbolic approximations and real world signatures. There are a number of unusual and esoteric signature scheme properties which have been reported in the academic literature, are permitted by the traditional definition of signature scheme security (EUF-CMA), yet are not captured by contemporary symbolic analysis tools. This discrepancy is significant. We consider a series of protocols with known attacks and show that existing analysis tools incorrectly verify these protocols as being secure. We provide a new symbolic model of digital signatures which resolves this mismatch and can automatically find these attacks. Additionally, using our improved model we discover previously unknown attacks on real world protocols, including an attack on a protocol which was previously verified to be “secure”.

Back to SRM presentations.

Contact

For questions and comments contact Jun or Peter .