Student Projects

Thesis topic information

These projects are available for Master theses, and could be tailored for Bachelor theses. If you would like to do a thesis with me but have not found anything here that excites you, let me know and we will try to find a topic that suits your interests.

Android security

Android is a booming ecosystem with zillions of third-party apps, many app markets, various devices and multiple platform versions. Most likely you own (at least) one Android device yourself. Why not spend your Master thesis pondering Android security and trying to improve it?

Below are some tentative thesis topics. In the SaToSS group, we have everything you need to start investigating Android: app datasets, devices, and tools. Android is not as complex as you might think. Usually, you will need to learn to install apps on a device or emulator, to run and write relatively small Python programs, and to understand some basic Machine Learning approaches.

Summary of these projects as pdf with references.

Topic 1: Resource-based repackaging detection

Android apps are sources of revenue for their developers, yet it is very easy to plagiarize a third-party app by repackaging it. In this thesis you will design a new scheme for detecting repackaged Android apps by using the resource files included in the packages. Resource files, such as images, strings and XML layouts, have demonstrated their potential in detecting cloned apps. Subsequent experiments have shown that particular resource file types can serve as better indicators of repackaging. In your thesis, you will focus on further improvements of the method. The improvements can be in the direction of robustness (currently it is very easy for an adversary to slightly modify the resource files so that the method no longer recognizes them as identical); scalability (improving the performance by moving from pair-wise app comparison to nearest-neighbour search in some ordered space); or you may focus on developing a hybrid approach that fuses the resource-based detection with a code-based repackaging detection technique.
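To give a feeling for the starting point, here is a minimal sketch of pair-wise, resource-based comparison. It is only an illustration and not the method used in our experiments: it assumes that resources live under the res/ and assets/ folders of the APK (which is a ZIP archive), that files are compared by exact SHA-256 digest, and that app similarity is the Jaccard index of the two digest sets. The function names resource_hashes and jaccard are made up for this example.

```python
import hashlib
import zipfile


def resource_hashes(apk_path, prefixes=("res/", "assets/")):
    """Return the set of SHA-256 digests of the resource files in an APK.

    Assumption: a file counts as a resource if its path inside the
    archive starts with one of the given prefixes.
    """
    digests = set()
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            if name.startswith(prefixes) and not name.endswith("/"):
                digests.add(hashlib.sha256(apk.read(name)).hexdigest())
    return digests


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def resource_similarity(apk_a, apk_b):
    """Share of identical resource files between two APKs, in [0, 1]."""
    return jaccard(resource_hashes(apk_a), resource_hashes(apk_b))
```

A pair scoring above some threshold would be flagged as a repackaging candidate. The sketch also shows both weaknesses named above: changing a single byte in a resource changes its digest (robustness), and comparing every pair of apps in a dataset is quadratic in the number of apps (scalability).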

Topic 2: Dataset building

One of the most challenging tasks in Android security research is collecting the right dataset to validate the developed approach. In your thesis you will work on collecting a dataset of third-party apps to share with the community. The dataset will be focused on a particular task: repackaged app detection (a set of confirmed repackaged and non-repackaged app pairs); evolution of Android apps (we want to collect many last-generation apps and check how they cope with the recent changes in the Android platform architecture); or malware detection (a representative set of recent malware samples). Dataset collection typically involves crawling apps from app markets and querying different online services (e.g., VirusTotal).

Topic 3: App code analysis for anomaly detection

This thesis will focus on applying static and dynamic analysis tools to Android apps in order to detect anomalies (e.g., malicious or buggy behaviors). Some theoretical work can also be considered (e.g., developing a semantic model of Android apps expressed as a graph or a state machine).

Topic 4: Exploring dynamic app models for malware and repackaging detection

The available scientific literature on malware and repackaging detection often utilizes statically constructed app models (different graphs or graph-derived features, or even code n-grams). However, obfuscation and dynamic code loading hinder the application of statically constructed models, because not all code is available at analysis time. In this thesis you will explore whether dynamically built models could be successfully applied to Android malware and repackaging detection. You will design a dynamic app model (e.g., a graph), assess its performance, and investigate the costs and benefits of using such models (they require more time to create, and they will also be incomplete).

Topic 5: App runtime performance assessment

Many research tools introduce runtime overhead due to the addition of ancillary code, but there is currently no good way to quantify the impact of such tools on runtime performance. You will investigate the Android SDK profiler and other available approaches to measuring the performance of Android apps, and will assess the performance penalties of different tools.

Topic 6: Code coverage for automated testing

Code coverage is an important metric for evaluating how well an application has been tested. This is especially critical for third-party Android apps being analyzed for security and reliability purposes. Our group has recently developed ACVTool, a new tool to measure code coverage in black-box app testing, and we have many ideas on how to extend the tool and how to apply it in automated testing pipelines. In this thesis project you can focus on 1) extending ACVTool by adding more coverage metrics; or 2) using ACVTool in an automated testing pipeline and dynamically improving the testing results by considering the coverage achieved so far.

Towards Automated Risk Management

Risk assessment (threat analysis) is traditionally performed by a group of human analysts (think consulting companies that charge per hour) who brainstorm about potential threats to the organization. This activity produces incomplete results, because humans are not able to take into account all possible scenarios. Security researchers, including our group, have therefore started to work on automated risk assessment techniques, in which threats and potential attacks are identified automatically from some system model.

Risk management also includes risk treatment: the identification of countermeasures that need to be introduced in the organization in order to reduce risks to acceptable levels. In this thesis you will do research in the general area of automated risk management with attack trees. Possible topics for a Master thesis are below.

Summary of these projects as pdf with references.

Topic 1: Automated assignment of countermeasures into an attack tree

Assume that security analysts have designed an attack tree characterizing existing attacks for an organization. For such a tree there exist several approaches to identify the most critical attack scenarios (based on parameters important for the attacker, such as cost or time, or parameters important for the defender, such as impact). Given the set of the most severe attacks, we would like to automatically produce a set of countermeasures thwarting these attacks.
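As an illustration of what "identifying the most critical attack scenario" can mean, here is a minimal sketch that finds the cheapest attack in a small attack tree by bottom-up evaluation. It rests on several assumptions made only for this example: the attribute is the attacker's cost, an OR node takes the minimum over its children, an AND node takes the sum, and the structure is a proper tree (no leaf is shared between branches). The Node class, the leaf names and the cost values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    gate: str = "LEAF"          # "LEAF", "OR" or "AND"
    cost: float = 0.0           # only meaningful for leaves
    children: list = field(default_factory=list)


def cheapest_attack(node):
    """Return (cost, set of leaf labels) of the cheapest attack on node."""
    if node.gate == "LEAF":
        return node.cost, {node.label}
    results = [cheapest_attack(child) for child in node.children]
    if node.gate == "OR":
        # The attacker picks the cheapest alternative.
        return min(results, key=lambda r: r[0])
    if node.gate == "AND":
        # The attacker has to perform every child step.
        total = sum(cost for cost, _ in results)
        leaves = set().union(*(leaf_set for _, leaf_set in results))
        return total, leaves
    raise ValueError(f"unknown gate: {node.gate}")


# Hypothetical example tree with invented costs.
tree = Node("access account", "OR", children=[
    Node("steal credentials", "AND", children=[
        Node("phishing", cost=50),
        Node("bypass 2FA", cost=100),
    ]),
    Node("exploit server", cost=300),
])

cost, steps = cheapest_attack(tree)
```

The returned set of leaf steps is exactly the kind of input that a countermeasure selection procedure would start from.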

You will work on identifying a plausible approach for the automated selection of preventive security controls. This approach will likely require a knowledge base or an ontology that captures the applicable countermeasures for each attack type (we can start from, e.g., CAPEC). Given such a knowledge base, you will design an algorithm to select countermeasures based on some chosen metrics (e.g., risk leverage, impact reduction, likelihood reduction). You will also need to investigate how to accommodate the selected countermeasures in the original attack tree (thus yielding a correct attack-defense tree). The overall approach will be implemented as a prototype tool and integrated with the ADTool format.

Topic 2: Comparison of automatically generated and manually designed attack trees

Recently several tools have emerged that aim at automated construction of attack scenarios expressed as attack trees. These tools, however, produce "flat" trees, i.e., they do not structure the attack scenarios in some abstract way. Human analysts instead aim at establishing categories of attacks, with more abstract attack steps appearing closer to the root of the tree. In this thesis you will study existing methodologies for automated and manual design of attack trees, and will propose a taxonomy of attack tree properties that will bridge the gap between "flat" and "abstract" attack trees.

Topic 3: Sensitivity analysis on attack-defense trees

Assume that security analysts have designed a comprehensive attack-defense tree representing the existing attacks and the already existing controls for an organization. Given this tree, quantitative analysis for various attributes (time, cost, probability of success, impact of an attack) can be performed in the ADTool. Sensitivity analysis is a method for experimenting with different attribute values to identify critical paths in the tree. In a nutshell, the analyst tries to establish how variance in some attribute values affects the value for the root node.
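One simple way to picture this is one-at-a-time sensitivity analysis: change a single leaf value by a fixed percentage, recompute the root, and record the difference. The sketch below does this for the attacker's minimal cost on a plain attack tree. It is only an illustration under assumptions of our own: OR takes the minimum, AND takes the sum, defenses are left out, and the tree encoding (nested tuples), the leaf names and the numbers are invented. It does not use the ADTool or its file format.

```python
def root_cost(tree, costs):
    """Evaluate the minimal attack cost at the root.

    tree is either a leaf name (str) or a tuple (gate, [children])
    with gate "OR" (minimum over children) or "AND" (sum of children).
    """
    if isinstance(tree, str):
        return costs[tree]
    gate, children = tree
    values = [root_cost(child, costs) for child in children]
    return min(values) if gate == "OR" else sum(values)


def sensitivity(tree, costs, delta=0.2):
    """For each leaf, return the change of the root value when that
    leaf's cost is decreased / increased by the fraction delta."""
    base = root_cost(tree, costs)
    report = {}
    for leaf, value in costs.items():
        low = root_cost(tree, {**costs, leaf: value * (1 - delta)})
        high = root_cost(tree, {**costs, leaf: value * (1 + delta)})
        report[leaf] = (low - base, high - base)
    return report


# Hypothetical tree and leaf costs.
tree = ("OR", [("AND", ["phishing", "bypass_2fa"]), "exploit_server"])
costs = {"phishing": 50, "bypass_2fa": 100, "exploit_server": 160}

report = sensitivity(tree, costs)
```

In this toy example, making "exploit_server" 20% more expensive does not change the root value at all, whereas making it 20% cheaper switches the cheapest attack to a different branch. Effects of this kind are what a sensitivity analysis methodology has to expose systematically.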

The goal of this thesis will be to establish a methodology for sensitivity analysis on attack-defense trees. For example, if the analyst's goal is to find the best place to introduce a new security control, what process does she need to follow? This methodology will be implemented as a prototype tool and integrated with the ADTool format.

Topic 4: Cyber-insurance propositions modeling with attack trees

Cyber-insurance is a recent solution that allows many companies to share risks. These companies may rely on already available internal risk assessment results to understand the cost/benefit ratio of various cyber-insurance propositions.

The first goal of this thesis is to review the cyber-insurance products available in Luxembourg, and to relate them to attacks that can be expressed in attack trees. Then you will develop a methodology to assess costs and risk exposure for the insured and the insurer sides that will integrate insights from attack tree-based quantitative and qualitative analyses.