Comp527: Final Project Ideas

You are strongly encouraged to come up with your own ideas, of course, but here are some ideas to get you started thinking about your final projects.

Warning: the projects that, in past years, have succeeded were very focused. You don't have time to solve all the world's problems. Choose something narrow and leave yourself room to expand later.

Operating Systems

Linux Security Modules: A good way of approaching ACLs, MLS, or whatnot would be to do it in the context of the Linux kernel's new security architecture, which is supposed to make it easier to add these kinds of patches to the kernel. You might consider adding special-purpose modules that better support the kinds of privilege separation policies being implemented by new e-mail transport agents, SSH daemons, and so forth.
Firewalls: Add firewall-style support to an operating system. You could try developing application-level gateways for several of the common protocols. You could also look at building an efficient router. Rather than just hacking away, you might also look at the software engineering aspect and look at some kind of firewall architecture that has inherent strengths against attacks on any of its protocols. As above, you might look at building low-level OS mechanisms that better support the design and implementation of existing firewall software.
Key management: MacOS X has a cool feature called the "KeyChain" which can store all your other cryptographic keys in a single box. Plan9 and Kerberos have similar features. Can you design a general-purpose OS mechanism for handling all these different forms of key storage? Maybe you could look at integrating this with smartcards for external key storage.

Peer-to-peer networking

Time synchronization: In many distributed systems and cryptographic protocols, the clock is used to gain a wide variety of valuable properties. However, globally synchronized clocks are not necessarily a given. There's a protocol called NTP (the Network Time Protocol), which is widely used on the Internet, and it's even built into Windows XP and MacOS X. While some newer extensions add cryptography to the system, it's still fundamentally a top-down system, where more trusted servers help set the clocks on less trusted ones. Consider a p2p network where some nodes know the "correct" time (perhaps via GPS or atomic-clock radios) and other nodes are willing to lie about the time. You could investigate how everybody in such a network can still end up synchronized. Note that this synchronization can take several potential forms. For some applications, it's more important for two computers to agree on the precise length of a second. For other applications, it's more important for a set of computers to agree on their own version of the time, whether or not it's connected to the real world.
E-mail, instant messenger, etc.: Messaging applications are all the rage these days. You could investigate how to build decentralized chat systems (as in ICQ or AIM) or how to build e-mail systems. While you've got the "freedom" of starting from scratch, make sure you can address the inevitable spamming attacks.
Fair resource usage: Johnny Ngan is currently working on fair sharing of disk resources in a p2p system (i.e., guaranteeing that a user cannot consume more network storage than he or she is providing to the network). You can imagine versions of this problem that discuss network bandwidth usage. How can you guarantee that a user doesn't consume more network bandwidth than he or she provides? Or, turned the other way around, how can you guarantee that the p2p node in your dorm room, with some popular file on it, won't be overwhelmed with requests, even if all the other official replicas of the file are offline?
Read-write file storage: Several recent research trends have discussed the problems that occur with writing data to untrusted storage. Cryptographic filesystems aim to protect the confidentiality of a file, but they don't gain anything in availability against an adversary who can delete files. Other p2p systems have started to look at supporting writes, particularly with multiple, concurrent writers. There's plenty of work to do here, including deciding what the right file systems semantics should be, as observed by a user of the system. Should it "feel" like a normal filesystem, or if you relax that, are there performance and/or security benefits?
Traffic analysis: Even if the world starts encrypting its network traffic, you can learn a lot just by knowing who sent a message to whom. You could look at building traffic analysis systems, particularly in a p2p system, to see whether a small number of nodes can see enough routing traffic, or otherwise learn enough to get a good picture of what's happening inside the network. You could also investigate whether "standard" anti-censorship or anonymity-preserving systems can really hold up when some percentage of the nodes are colluding to figure out what's going on.
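For the time synchronization idea above, one possible starting point (a minimal sketch, not a full protocol) is a trimmed mean: a node collects clock-offset reports from its peers, throws away the most extreme values on each side, and averages the rest. The function name, the max_liars parameter, and the sample offsets are all made up for illustration. The sketch assumes the node can bound the number of lying peers and only covers the local filtering step, not how nodes find peers or converge over several rounds.

```python
from statistics import mean

def fault_tolerant_offset(reported_offsets, max_liars):
    """Estimate a clock offset (in seconds) from peer reports.

    Assumes at most max_liars reports are bogus. The max_liars smallest
    and max_liars largest reports are discarded, so every surviving value
    lies within the range spanned by the honest reports.
    """
    if len(reported_offsets) <= 2 * max_liars:
        raise ValueError("need more than 2 * max_liars reports")
    ordered = sorted(reported_offsets)
    kept = ordered[max_liars:len(ordered) - max_liars]
    return mean(kept)

# Seven hypothetical peers report offsets; two of them lie wildly.
reports = [0.02, -0.01, 0.00, 0.01, 0.03, 500.0, -900.0]
estimate = fault_tolerant_offset(reports, max_liars=2)
print(f"estimated offset: {estimate:.3f} s")
```

Note that trimming also discards some honest reports, which is the price of not knowing which peers are lying. A project could compare this against other filters, such as a median or interval intersection.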

Cryptography / Networking

Extensions to cryptyc: You used cryptyc in your soda machine project. By now, you've probably got some ideas of how it could be improved. A good start would be extending the cryptyc implementation to support the public-key extensions described in their latest papers. Cryptyc also needs to have a decent GUI to show you a picture of the protocol you're studying and to give you a graphic understanding of how your system is or isn't working properly. (Talk to Scott Crosby if this interests you.)
Login authentication: Modern Unix systems support pluggable authentication modules (PAM). Write a PAM that uses a smart card or a PalmPilot or some other interesting technique.
High performance ciphers: How fast can you go? RC4 can theoretically be tuned to go the speed of memcpy(3C). What if you wire together gzip (or some other compression system) with an encryption system? Can you go faster with the two systems optimized together than running them separately? Compare the performance of several different ciphers (either with code you get from the net or from Schneier's book). Fine tune the inner loop at the raw assembly level if you have to. It would be interesting to look more closely at the new AES (Rijndael) cipher, particularly implementing it in some kind of silicon (e.g., Xilinx parts).
One-time pad management: Assume the worst case: all traditional cryptosystems have been broken, P=NP, and evil attackers are actively in control of every network. This means the only remaining cryptosystem is the one-time pad. How would you do the equivalent of digital signatures? How would you do the equivalent of public key infrastructure? How would you securely exchange pad bits? Work out the models on paper and write a program that implements them.
TCP/IP protocol stuff: Generic TCP/IP is vulnerable to all kinds of attacks. Design and implement other protocols that might be stronger, but still preserve the efficiency of TCP/IP. For example, an attacker can emit a RST packet which closes the connection. That's a denial-of-service attack. Can you build an efficient networking protocol that's resistant to this attack? What about session hijacking?
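If you take on the high performance ciphers idea above, it helps to have a plain, obviously-correct reference version to check your tuned code against. Here is a minimal sketch of RC4 along with a crude timing helper. The throughput_mb_per_s function and its default buffer size are assumptions for illustration. Pure Python will be nowhere near memcpy speed, so treat this only as a correctness baseline.

```python
import time

def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key-scheduling algorithm: permute the state array using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation: XOR each data byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def throughput_mb_per_s(cipher, key: bytes, n_bytes: int = 1 << 16) -> float:
    """Time one call of cipher(key, payload) on n_bytes of zeros."""
    payload = bytes(n_bytes)
    start = time.perf_counter()
    cipher(key, payload)
    elapsed = time.perf_counter() - start
    return n_bytes / elapsed / 1e6

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(f"{throughput_mb_per_s(rc4, b'Key'):.2f} MB/s")
```

Any other cipher you wrap in the same (key, data) signature can be passed to the timing helper, which makes side-by-side comparisons easy.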

Language Security

Security through code rewriting: Many researchers are adding new security semantics to Java these days by rewriting Java bytecode. You can use this trick to force the program to observe some behaviors. In the case of Java, a number of nice libraries are available that make it relatively easy to manipulate Java bytecode. For something like Scheme, it's already pretty easy to manipulate the code. Pick a class of security policies you're interested in and investigate adding those semantics to Java, Scheme, or whatever.
Malicious code detection: Can you parse a program and statically detect if it will misbehave? Anti-virus programs usually use a long list of patterns that they try to match in the software. Build something similar.
Agent systems: In agent systems, multiple agents are running together in the same language runtime and interacting with each other. Example agent systems include MUDs (multi-user dungeons) and stock market trading systems. One possible project is working on resource management and security issues here (to control misbehaving agents). Another class of projects is to build an agent system that takes advantage of somebody else's mechanisms.
Microsoft C#: Microsoft has a new language called C# that's part of Visual Studio.Net. For all intents and purposes, it's just Java except (of course) it's incompatible with Sun's system. The bytecode underneath C#, which runs on the Common Language Runtime (CLR), is much more general than Java's bytecode: a C or C++ compiler can target it, so it supports general pointer manipulation and the works. It supports unsafe code, but it has a verifier. Possible projects include stress-testing Microsoft's verifier and looking for weaknesses in Microsoft's class libraries. There's fame and fortune in here for somebody who can blow Microsoft's security to smithereens. Another set of projects would be re-doing the work we've done on Java bytecode rewriting in the context of CLR.
Code obfuscation: You can mutate a program's control flow and its dataflow. You can also write tools that may be able to reconstruct this information, even if it's been obfuscated. Build an obfuscation system and/or find some obfuscated code (perhaps buried in Windows XP or Office XP) and try to unobfuscate it. An interesting project, for example, might be to create a dataflow/control flow tool to study how Office XP detects changes in your hardware or otherwise tries to detect if it's been copied.
Static analysis: Many interesting systems in the past several years have used static analysis techniques to study C programs to look for security holes. You could pick up an existing tool, like CQUAL, or look at writing your own tools. You're more likely to get interesting results trying to apply these existing tools to detecting new kinds of security problems.
Dynamic analysis: Tools like Purify are designed to augment a program, before it starts running, to check for various buggy program behaviors (including buffer overflows and other common C pointer mishandling issues). You might be able to do something similar, whether through a compiler hack, an object-code rewriting hack, or a Java bytecode rewriting hack. The challenge is to collect enough information to be able to detect problems without requiring a huge memory or CPU time overhead.
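For the malicious code detection idea above, the pattern-matching approach that anti-virus programs use can be sketched in a few lines. Everything here is hypothetical: the signature names, the byte patterns, and the sample input are invented for illustration and do not correspond to any real malware. A real scanner would need a far larger signature list and a faster multi-pattern matching algorithm.

```python
# Hypothetical signature database: name -> byte pattern to look for.
SIGNATURES = {
    "hypothetical-dropper": bytes.fromhex("deadbeef"),
    "hypothetical-shell": b"/bin/sh -c",
}

def scan(data: bytes, signatures=SIGNATURES):
    """Return (name, offset) pairs for every signature match, ordered by offset."""
    hits = []
    for name, pattern in signatures.items():
        start = data.find(pattern)
        while start != -1:
            hits.append((name, start))
            # Keep searching one byte further on to catch overlapping matches.
            start = data.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[1])

# A made-up "program" containing both patterns.
sample = b"\x00\x01" + bytes.fromhex("deadbeef") + b"run /bin/sh -c ls"
hits = scan(sample)
for name, offset in hits:
    print(f"{name} at byte {offset}")
```

The obvious weakness, and a good place to start a project, is that any trivial change to the matched bytes defeats the scanner.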

PDA / Mobile computing platforms

Palm Pilots, cell phones, and other gizmos that have (a) CPUs, (b) screens, (c) user input, and (d) antennas are becoming increasingly ubiquitous as their prices and weight drop and their functionality increases. If they were truly everywhere, you could consider building systems around this.

E-cash systems: Beam money to your friends. Beam money to the Coke machine. Use cryptography to protect the data, of course, but you also need to worry about double-spending, non-repudiation, and all that. Your job gets easier because the phones are always on-line, but you should be able to have one of the parties be offline (e.g., the Coke machine), and have everything else still be secure.
Wireless Ethernet: Wireless Ethernet (a.k.a. 802.11b or Wi-Fi), now deployed at Rice in Duncan Hall and Fondren Library, makes it trivial to listen in on other people's conversations, and a number of hacker tools exist to do exactly that. It's also easy to "race" the local DNS server, the DHCP server, and so forth and give bogus responses that screw everybody up. At the Usenix Security Symposium in 2001, for example, one idiot was answering all DNS requests, and pointing them at his own laptop, where he had a Web server that fed out a lame Web page. Because it's meant to be a public Ethernet, you can't solve the problem by saying "only authorized users can connect." Last year, we had some students work on being able to localize themselves inside Duncan Hall (and they got two conference publications and a forthcoming journal article out of it). The next step is being able to localize somebody who doesn't want to be found.

Software Engineering for Security

Security lint: In the same way that you might detect malicious code, you can detect well-known bad programming practices. This might cover everything from C buffer overflow conditions to common misunderstandings of Unix system calls. Mudge (at L0pht.com) has a list of over 100 common bad practices. Build a C source analyzer that can detect these practices. (See "static analysis" above, as well.)
Fault injection: Software is pretty fragile stuff. If you deliberately break it a little bit, or if you deliberately feed it unusual input, you can often coerce software to fail. You could investigate building a general-purpose tool, or you could focus on specific classes of programs, such as network servers (e.g., e-mail, Web, Usenet).
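To make the fault injection idea above concrete, here is a minimal sketch of feeding deliberately corrupted input to a program. The parse_record function is a toy stand-in invented for this example (one length byte followed by a payload), and the choice to treat ValueError as a "clean rejection" and anything else as a "crash" is an assumption. A real project would drive an actual server or file parser instead.

```python
import random

def mutate(data: bytes, n_flips: int, rng: random.Random) -> bytes:
    """Return a copy of data with n_flips randomly chosen bits flipped."""
    buf = bytearray(data)
    for _ in range(n_flips):
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)

def parse_record(data: bytes) -> bytes:
    """Toy target: one length byte, then exactly that many payload bytes."""
    if not data:
        raise ValueError("empty input")
    length, payload = data[0], data[1:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    return payload

def fuzz(target, seed_input: bytes, trials: int, seed: int = 0):
    """Run target on mutated inputs; count clean rejections, collect crashes."""
    rng = random.Random(seed)
    rejected, crashes = 0, []
    for _ in range(trials):
        case = mutate(seed_input, n_flips=1, rng=rng)
        try:
            target(case)
        except ValueError:
            rejected += 1          # the target noticed the bad input
        except Exception as exc:   # anything else counts as a crash
            crashes.append((case, exc))
    return rejected, crashes

rejected, crashes = fuzz(parse_record, b"\x05hello", trials=200)
print(f"{rejected} inputs rejected, {len(crashes)} crashes")
```

The toy parser checks its length field, so it should never crash here. The interesting targets are the ones that don't check.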

Privacy

People often have secrets. While they're straightforward to protect on computers (e.g., mark a file as readable only by its owner), things get much more difficult on the net.

Web privacy: Cookies can be used to track you. URLs with funny extensions can be used to track you. Web pages load images from third parties like DoubleClick. They're not just advertisements, they're tracking you as well. There are many opportunities to build projects here. You could build Web proxies that do anything from ad filtering, like WebWasher, to ad jamming (feeding back bogus cookie information). You could also build systems where privacy emerges as a property of a lot of people surfing at the same time, such as Crowds. On the flip side, you could analyze systems like these and try to systematically break them. (In the hallway outside DH3004, you can see the results of a previous semester's attempt to graph cookie usage on the web. This work really needs to be redone, properly.)

Applications

Auditing infrastructure: The DIDS paper discussed a networked audit facility for intrusion detection. But what happens if the centralized audit machine is successfully attacked? Build a truly decentralized system where, even if some number of systems go down, you still have a complete record of what happened.
Spam filtering: Write a classification system that can distinguish spam from normal mail. I have several years of spam saved that you can use to train your system. Another related project would be writing a Web proxy server that filters out advertisements.
Chat systems: You're talking to somebody on the chat server today. Tomorrow, you get an e-mail from somebody claiming they're the same person. Build a way for people who don't know each other to be able to identify each other later with some kind of cryptographically strong authentication. A cool property of your system would be an ability for it to be deployed incrementally with existing chat systems, rather than requiring everybody to switch en masse to something new.
Calendaring systems: I want to schedule a meeting with people from different companies. The computer should be able to look at my free time and the free time of everybody else and come up with the best time we can all meet. Can you do this without me being able to read the full calendars of everybody else? Can you do it without a central calendar server (i.e., using some cryptographic primitives to arrive at a time we can all meet without anyone revealing too much of their private schedule information)?
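For the spam filtering idea above, one common starting point (by no means the only classification technique you could use) is a naive Bayes classifier over word counts. The sketch below is a minimal version with add-one smoothing. The class name, the whitespace tokenizer, and the four training messages are all invented for illustration, and it assumes at least one message of each kind has been seen before classify is called.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Word-count naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label: str) -> float:
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        # Log prior: the fraction of training messages carrying this label.
        score = math.log(self.message_counts[label] / sum(self.message_counts.values()))
        # Log likelihood of each word, smoothed so unseen words are not fatal.
        for word in words:
            score += math.log((counts[word] + 1) / (total_words + vocab))
        return score

    def classify(self, text: str) -> str:
        words = text.lower().split()
        return max(("spam", "ham"), key=lambda label: self._log_score(words, label))

# Tiny made-up training set; a real one would be years of saved mail.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win cash now", "spam")
spam_filter.train("cheap pills win big", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch at noon tomorrow", "ham")
print(spam_filter.classify("win cheap cash"))
print(spam_filter.classify("meeting at noon"))
```

Splitting on whitespace ignores headers, HTML, and deliberate misspellings, so better feature extraction is where most of the project work would go.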

Other Fun Stuff

Smart cards: We've got some general-purpose smart cards. Design an application for them to replace some traditional function in the world.
Biometrics: Read Anderson, Chapter 13, then go implement your own biometric. Maybe start with something relatively easy like finger geometry recognition by putting your hand under a video camera. Get lots of people to try it and see how well you can do.
Copy protection: You can look at existing digital rights management systems (those few that are available) and show how they can be broken. You might also look at lower-level issues, such as how Macrovision works, or what it takes to get around the console protections on consumer video game boxes. Another related idea is to look at U.S. patent 6,018,374, which talks about using infra-red laser light to make a movie screen look washed out to a video camera. It would be interesting to see how well this actually works, and whether commercial IR filters could fix the problem.

Dan Wallach, CS Department, Rice University
Last modified: Thu 10-Oct-2002 16:15