DEPT. of MIS, National Chengchi University


Research Projects

Binflow: Static Flow and String Analysis of iOS Mobile Applications

Millions of mobile apps have been published in Apple's AppStore with more than 15 billion downloads by iOS devices. In order to protect iOS users from malicious apps, Apple has strict policies which are used to eliminate apps before they can be published in the AppStore. We propose flow and string analysis techniques for iOS executables for statically checking security development policies, particularly for those that are related to dynamically loaded classes. In order to check that an app conforms to such a policy, it is necessary to determine the possible string values for the class name parameters of the functions that dynamically load classes. The first step of our approach is to construct the assembly for iOS executables using existing tools. We then extract flow information from the assembly code and construct control flow graphs (CFGs) of functions. We identify functions that dynamically load classes, and, for each parameter that corresponds to a dynamically loaded class, we construct a dependency graph that shows the set of values that flow to that parameter. Finally, we conduct string analysis on these dependency graphs to determine all potential string values that these parameters can take, which identifies the set of dynamically loaded classes. Taking the intersection of these values with patterns that characterize Apple's app security development policies, we are able to detect potential policy violations.

This is the MOST granted project:106-2221-E-004 -002 - (PI: Fang Yu, NT 578,000, 2017/8-2018/7).

VISO: Securing Open Source Cloud Platforms via Virtualization Introspection and Artificial Intelligence

Cloud systems have rapidly replaced personal computers and become the dominant platform nowadays. Security threats could be one of the major stunning blocks on this evolution road. While system vendors and cloud tenants are benefit from sharing resources in the cloud environment, security breaches may cause worse damages of the cloud ecosystem than personal computers. We propose to investigate a virtualization introspection system on Openstack for securing open-source cloud platforms.  By using artificial intelligence techniques and automata theory to analyze behaviors of virtual machines, we aim at a formal and systematic approach to enhance cloud security, achieving mandatory, transparency and precision on defense mechanisms.

This is the MOST granted project:103-2221-E-004 -006 -MY3 (PI: Fang Yu, NT 1881,000, 2014/8-2017/7). [HICSS 2014, IJCNN 2014, IJCNN 2015, CloudCom 2015, IEEE Cloud 2016, IEEE BigData 2017, IEEE DataCom2017]

AppBeach: Revealing Behaviors of Mobile Applications via Static Binary Analysis

We investigate static binary analysis techniques on iOS executable with the aim of revealing suspicious behaviors that are embedded (but may not be shown to users) in mobile applications. The main idea is counting (or resolving) explicit (implicit) sensitive function calls in executable. We decrypt online apps and use IDA Pro as a disassembler to reverse engineering the binary, and as a script runner to extract embedded information in executable. Particularly, we use underlying system function calls to characterize application behaviors. We also build a pattern library of user-sensitive functions (such as access private information, connect to ftp cites, change external web page contents), and by matching these patterns against resolved system call functions, we are able to reveal behaviors of iOS apps, and further identify suspicious ones. We develop the tool AppBeach (App Behavior Checker) and reveal behaviors of hundreds of apps that are directly downloaded from the Apple app store.

This is the NSC granted project: 102-2221-E-004 -002 - (PI: Fang Yu, NT 560,000, 2013/8-2014/7).  [MS 2013, IJCNN 2013, CCISA 2015, ICWS 2016]

This project is also partially funded byTWISC.

Patcher: Vulnerability Detection and Patch Synthesis with Symbolic String Analysis

We investigate symbolic string verification techniques with the aim of developing a formal approach for automatic detection and removal of string related vulnerabilities in Web applications. The main topics include: sanitization synthesis, abstraction, and scalable service development.

This is the NSC granted project (PI: Fang Yu, NT1600,000 , 2010/11- 2013/7). [ISSTA 2016, CAV 2016, HICSS 2014, FMSD 2014, IJFCS 2011, ICSE 2011, SPIN 2011]

ICT-enabled Services: Service Quality and Service Innovation

The purpose of this project is to facilitate a high-standard service quality of cloud-based innovation services. One of our recent results addresses formal quantitative analysis on a distributed service framework. Its queuing models for the cloud-based streaming services lead to closed form expressions for service quality of system and the corresponding computer resource required. A simulation procedure is also proposed to catch the dynamics of the presented cloud-based streaming services, estimating operation characteristics such as lag time and interruptions that a customer may experience under different environment settings. Another direction is to develop innovative services and mobile applications. We are particularly interested in location based services and 3D immersion.

This work is partially funded by Service Innovation and E-commerce of  NCCU Top University Project. (PI: Fang Yu, NT289,000 , 2013/1- 2013/12). [IS 2014, SCC 2013, IS 2012]

Logics and Verification of Concurrent Systems

We investigate testing, simulation and verification techniques for symbolic consistency checking of multi-threaded programs. We also investigate strategy logics on games. This is joint work with the verification lab@NTU EECS under the supervision of Prof. Farn Wang. Specifically, we provide a new symbolic simulator that combines constraint solving with guided simulation strategies to find witnesses of program inconsistency. Another research direction joint with Prof. Chao is the theoretical study on Petri nets that have been widely used to model flexible manufacturing systems. we present new algorithms for synthesizing effective control policies and estimating reachable states without conducting expensive reachability analysis on a sub class of Petri nets.  

[CONCUR 2011, ACM SIGPLAN Notices (LCTES 2012), IETCTA 2013, CASE 2013, ISIE 2012]

Industrial Projects

Adaptive Social Network: The practice of 9EMBA.com

This social network research project, collaborated with Prof. Yang and Ribo Tech, is funded by SayLing Wen Cultural and Education Foundation. (PI: Fang Yu, NT 2866,000, 2016/3-2017/2). [IEEE EDGE 2017]

LightFarm: Intelligent Green IOT Application Development

The lightfarm project is collaborated and funded by Wieson Tech. (PI: Fang Yu, NT 1666,000, 2015/6-2017/6). [IEEE ICIOT 2017]

Text Mining and Cloud Computing

We investigate automatic web data crawling and knowledge extraction techniques under the private/public clouds. This is a joint TSMC granted project with Prof. Eugene Huang and Prof. Wenqing Liu, NCCU MIS. (PI: Eugene Huang, Co-PIs: Wenqing Liou and Fang Yu, 2010/10-2011/9).

Prior to NCCU

Supervisor: Prof. Tevfik Bultan
Coworker: Prof. Oscar H. Ibarra, Marco Cova, Muath Alkhalaf

String analysis is a static analysis technique that determines the string values that a variable can hold at specific points in a program. We develop an automata-based string analyzer, called Stranger (STRing AutomatoN GEneratoR), for real-world web applications. Stranger takes advantage on symbolic representation of automata, which allows the practical handling of automata on very large alphabets. Stranger also incorporates various novel approximation techniques and targets on proving the correctness of secured web applications. We later extend this approach to systems having both string and integer variables. By exchanging length information between string and arithmetic automata, we improve precision of both string analysis and size analysis. We incorporate backward analysis to generate vulnerability signatures to characterize malicious inputs. We also investigate the essence of string systems, and characterize the decidability of reachability analysis on various string systems. In one of our recent work, we use multi-track automata to generate effective patches for vulnerable applications. Stranger is implemented on top of Pixy and Mona projects. Stranger and benchmarks can be downloaded from here. [SPIN 2008, TACAS 2009, ASE 2009, TACAS 2010, CIAA 2010, IJFCS 2011]

2. VeriBPEL
Supervisor: Dr. Chao Wang

We propose a novel method for modular verification of BPEL service composition. We first use symbolic fixpoint computation to derive relations of the incoming and outgoing messages for well-defined web services based on reachability analysis. The relations are collected and serve as a repository of summarizations of individual web services. We then compose the summarization of external invocations, resulting in scalable verification of web service composition. We realize our ideas in VeriBPEL. The technical features include (1) a novel symbolic encoding for modeling the concurrency semantics of systems having both internal multi-threading and external message passing, and (2) a method for summarizing concurrently running processes, along with a modular verification framework that utilizes these summaries for scalable verification. [SIGSOFT/FSE2008]

3. Size Analysis on Object Constraint Language
Supervisor: Prof. Tevfik Bultan
Coworker: Erik Petterson

We aim to verify specifications of object oriented systems based on infinite state model checking techniques. We target on object constraint language and propose size abstraction and size analysis. We conducted a case study on the OCL specifications of Java Card APIs. The experiments indicate our abstraction is precise enough to verify/falsify target systems, while coarse enough to perform complex model checking techniques efficiently. [ESEC/FSE2007]

4. Characterizing Spiking Neural P Systems
Supervisor: Prof. Oscar H. Ibarra
Coworker: Sara Woodworth

Neurons are arguably one of the most interesting cell-types in the human body. A large number of neurons working in a cooperative manner are able to perform tasks that are not yet matched by the tools we can build with our current technology. Spiking Neuron P-systems (SNPs) incorporate ideas from spiking neurons into membrane computing. In this project, we give characterizations of sets definable by partially blind multicounter machines in terms of k-output SNPs operating in a sequential mode. [UC2006, NC2008]

5. SATRepair
Supervisor: Prof. Der-Tsai Lee and Prof. Sy-Yen Kuo
Coworker: Yaw-Wen Huang, Chung-Hung Tsai, and Hung-Yaw Lin

Fabricating large memory and processor arrays is subject to physical failures resulting in yield degradation. The strategy of incorporating spare rows and columns to obtain reasonable production yields was first proposed in the 1970s, and continues to serve as an important role in recent VLSI developments. Since the spare allocation problem (SAP) is NP-complete but requires solving during fabrication, an efficient exact spare allocation algorithm has great value. In this project, we proposed a novel Boolean encoding to reduce exact SAP to the satisfiablity problem, so that we can leverage the capability of modern SAT solvers. [DFT2005]

6. xBMC
Supervisor: Dr. Bow-Yaw Wang

We propose a new SAT-based model checking algorithm to solve the reachability problem of real-time systems. In our algorithm, the behavior of region automata is encoded as Boolean formulas, and hence any SAT solver can be used to explore the region graph efficiently. Although our SAT-based algorithm performs better than other algorithms in flaw detection, it is less effective inproving properties. To overcome the problem, we incorporate a complete inductive method in our algorithm to improve the performance when the property is satisfied. We implement both algorithms in a tool called xBMC and report experimental results. The experiments show that the combination of efficient encoding and inductive methods offers an effective and practical method for the analysis of timing behavior. [FORMATS/FTRTFT2004, ATVA2004, IJFCS2006]

Supervisor: Prof. Der-Tsai Lee and Prof. Sy-Yen Kuo
Coworker: Yaw-Wen Huang, Chung-Hung Tsai, and Christine Hang

WebSSARI stands for Web application Security via Static Analysis and Runtime Inspection. Viewing Web application vulnerabilities as a secure information flow problem, we created a lattice-based static analysis algorithm derived from type systems and typestate. During the analysis, sections of code considered vulnerable are instrumented with runtime guards, thus securing Web applications in the absence of user intervention. Soundness, i.e no false negative, is achieved by applying formal methods including typestate checking and bounded model checking. I worked on designing and developing core checkers including: (1) A light-weight Type Checker, and (2) A heavy-weight Bounded Model Checker. [WWW2004][DSN2004]

Here is a related link. Parts of the techniques have been commercialized by Armorize Technologies, Inc..

8. RED 4.0
Supervisor: Prof. Farn Wang
Coworker: Geng-Dian Huang

I was involved in designing and developing RED 4.0 on top of RED, the full Timed-CTL symbolic model checker initiated by Prof. Farn Wang in 1996. RED 4.0 enhanced RED in the following directions: (1) Symbolic Simulation, (2) Numerical Coverage Estimation, (3) Greatest Fixpoint Computation and (4) Symmetric Analysis with Pointer Data Structure. [RTCSA2003][CIAA2003, FORTE2003, TSE2004, TSE2006]

9. TimeC
Supervisor: Prof. Farn Wang

TimeC is a C-like language that supports basic C statements, timed statements, e.g., duration and switch-event, and Open Verification Library(OVL) assertions. I developed an optimized translator that automatically translates programs written in TimeC into timed automata and TCTL formula in the input format of RED. Integrating the translator as the front end of RED, we can automatically verify real-time systems specified in TimeC. [RTCSA2003, JEC2004]