The Canadian Centre for Cyber Security is proud to host GeekWeek 7, taking place Oct. 22 to 30, 2020, as a fully virtual event.
GeekWeek is an annual workshop organized by the Cyber Centre to foster collaboration between the Government of Canada, critical infrastructure partners and academic researchers to address vital problems facing the cyber security industry. Participants are given time and resources to conduct intensive research and development, and then devise and implement their own solutions to prevent, analyze or mitigate cyber threats.
What makes this workshop more than a typical hackathon is a selection of resources not available anywhere else. From advanced tools to millions of spam email samples, GeekWeek enables participants from every part of the cyber security spectrum to collaborate in new ways and improve the landscape.
Note that because of the technical nature of the workshop, GeekWeek is an invitation-only event.
GeekWeek is an annual workshop organized by the Canadian Centre for Cyber Security (Cyber Centre) to bring together key players in the field of cyber security to generate solutions to vital problems facing the industry.
GeekWeek offers a unique opportunity for participants to focus time and resources to conduct intensive research and development in the prevention, analysis and mitigation of cyber threats. The event enables representatives from Critical Incident Response Teams (CIRTs), Critical Infrastructure Partners (Government, Finance, Health, etc.), academia, and international cyber security partners to collaborate in new ways and improve the overall cyber security landscape.
GeekWeek has been held annually since 2014 and has delivered over 80 projects that have contributed to major innovations in the fields of malware analysis and detection, cyber health, network traffic and log analysis, to name a few.
GeekWeek started when the Canadian Cyber Incident Response Centre (now part of the Cyber Centre) identified a need to foster stronger collaboration within the cyber security community. The solutions-oriented workshop was created to be an environment where participants work together on projects that drive innovation in cyber security.
With very modest aspirations, the first edition captured the attention and interest of both federal and industry organizations with triple the expected number of attendees. Since then, the event has grown from a 3-day workshop to an 8-day event. The 2019 event welcomed 222 participants who collectively worked 18,000 hours researching, developing and implementing new and innovative ideas. That’s the equivalent of 9 employees working full-time for a year!
GeekWeek has produced innovations and advances in the following areas:
- Malware detection
- Spam and log analysis
- Mobile malware analysis systems
- Tools and techniques to detect cyber threats
- Information sharing technologies and standards
- Cyber sovereignty/geographic data flows
- Cyber health and forecasts
- Botnet traffic analysis
- Hardening of IoT devices
- Industrial control systems assessment
- Fly-away kit/laptops
- Enforcement processes
- Automated malware analysis
Why Attend GeekWeek?
This year, GeekWeek will be virtual. Participants attending GeekWeek benefit from:
- A unique environment of knowledgeable cyber experts to address concerns and challenges within the community.
- Access to advanced tools, and millions of samples of spam emails, malware, and analysis reports.
- Increased awareness of resources and solutions currently available to cyber security professionals.
- Advancing new ideas, innovative solutions and information sharing.
- Increased collaboration and information sharing between partners.
- Networking with the cyber security community of experts and development of professional relationships.
- Better understanding of challenges faced by peers in the industry.
GeekWeek brings together key players in the field of cyber security to generate solutions to vital problems facing the entire community in an innovative and collaborative format. Participants have the opportunity to pick which area they’d like to focus on, based on the themes and sub-themes listed below. Teams will then be formed according to these preferences. Participants are also encouraged to suggest projects associated with each of the themes and sub-themes listed below. We value the community’s expertise and encourage participants to share their ideas with us.
If you have a specific area of interest or theme you would like to work on, please indicate the number associated with it, as per the list below, in your application or once you receive your acceptance email. Each sub-theme will have an area of cybersecurity associated with it, but you should keep in mind that most themes will still touch several areas, giving you exposure to a multitude of subjects. For example, as this is a cybersecurity workshop, a theme focused on programming will also touch on cyber data analysis activities.
Areas listed in the sub-themes:
- [N] Network & Infrastructure
- [P] Programming & Advanced Algorithm
- [R] Reverse Engineering
- [O] Operating Systems Internals
- [D] Cyber Data Analysis and Visualization & Data Mining
1. National Domain Name System (DNS) [D]
Canada now has a national DNS operated by the Canadian Internet Registration Authority (CIRA). Every day, a large number of domains are blocked to protect CIRA Canadian Shield users online. This is only the first step. How could we develop better analytics and information-sharing processes to improve Canadian Shield so that it better protects industry and government partners, as well as private citizens, from unintentionally accessing these malicious domains?
2. Cyber Security Posture [D | N]
Well-crafted emails using identity spoofing and social engineering are key for successful phishing attempts. Available Domain-based Message Authentication, Reporting and Conformance (DMARC) aggregate reports use DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF) to confirm the source and legitimacy of emails. Outdated systems and protocols are usually more vulnerable to cyber threats. For example, cyber threat actors may exploit an old version of the SSL protocol to conduct a Man-in-the-Middle attack. Using Canada as an example, how could we assess the cyber posture of the Canadian cyber space against cyber best practices such as DMARC, SPF, DNSSEC, TLS, HTTPS rather than HTTP, etc.? How could we develop scorecards to help Cyber Centre industry partners with their adoption of modern and more secure technologies?
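As an illustration of what a scorecard entry could look like, here is a minimal Python sketch that grades one domain from its published SPF and DMARC TXT records. The function names, the point values and the scorecard fields are assumptions made for this example only; they are not a Cyber Centre scheme, and the records are passed in as strings rather than fetched from DNS.

```python
def parse_tag_list(record):
    """Parse a DMARC-style 'tag=value; tag=value' TXT record into a dict."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def score_domain(spf_record, dmarc_record):
    """Return a small scorecard for one domain (hypothetical scoring scheme)."""
    card = {"spf": "missing", "dmarc": "missing", "score": 0}

    if spf_record and spf_record.lower().startswith("v=spf1"):
        # The 'all' mechanism decides what happens to senders not listed.
        terms = spf_record.lower().split()
        if "-all" in terms:
            card["spf"], points = "hard fail", 2
        elif "~all" in terms:
            card["spf"], points = "soft fail", 1
        else:
            card["spf"], points = "permissive", 0
        card["score"] += points

    if dmarc_record:
        tags = parse_tag_list(dmarc_record)
        if tags.get("v", "").upper() == "DMARC1":
            # The 'p' tag is the policy requested by the domain owner.
            policy = tags.get("p", "none").lower()
            card["dmarc"] = policy
            card["score"] += {"none": 0, "quarantine": 1, "reject": 2}.get(policy, 0)

    return card
```

A real assessment would also need to resolve the records, follow SPF includes, and cover the other practices named above (DNSSEC, TLS versions, HTTPS), which this sketch leaves out.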
3. Cyber Sovereignty [D]
One would expect that communications between Canadian endpoints do not need to be routed outside of the country. However, for efficiency purposes or other practical reasons, this is not always the case. That can pose security risks. The Border Gateway Protocol (BGP) manages from source to destination the routes that Internet communications follow. By manipulating BGP, cyber attackers can reroute data in their favour and intercept or modify traffic. How can we develop analytics to detect malicious behaviours and changes in traffic routes in real time? How can we prevent traffic hijacking to help protect Canadian communications?
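One simple starting point for such analytics is to flag routes between two Canadian networks that transit a network registered elsewhere. The Python sketch below assumes an AS path is available as a list of AS numbers and that a lookup table from AS number to country of registration exists; both the function name and the table are hypothetical, and country of registration is only a rough proxy for where traffic physically flows.

```python
def leaves_country(as_path, as_country, home="CA"):
    """Return the transit ASNs on a domestic-to-domestic path that are
    not registered in `home`. Unknown ASNs are treated as foreign."""
    if not as_path:
        return []
    first, last = as_path[0], as_path[-1]
    if as_country.get(first) != home or as_country.get(last) != home:
        return []  # not a domestic-to-domestic path; out of scope here
    return [asn for asn in as_path[1:-1] if as_country.get(asn) != home]


# Example with AS numbers reserved for documentation (assumed mapping).
AS_COUNTRY = {64496: "CA", 64497: "US", 64498: "CA"}
print(leaves_country([64496, 64497, 64498], AS_COUNTRY))
```

Comparing the result for the same source and destination over time would be one way to notice a route that suddenly starts leaving the country.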
4. Preparing for a Wireless Future [D]
From cellphones to home appliances, an increasing number of devices are wirelessly connected to the Internet. As the use of 5G becomes widespread, we can expect this trend to continue at full speed. However, as wireless connectivity increases, so too does the number of cybersecurity threats targeting cellphones. Examples include SIM swapping and threats from other wirelessly connected devices. Telecommunication data can offer invaluable insight on the threats targeting mobile devices. How could we parse and analyze wireless network data, such as Signaling System No. 7 (SS7) data, to gain a better understanding of mobility threats, especially ones related to 4G/LTE or 5G? How could we build processes and systems to better prevent wireless threats and better protect Canadians?
5. Detecting and Decoding Advanced Persistent Threat (APT) Malware [R | P]
Malicious APT artifacts are usually stealthy, well-crafted and possess strong anti-analysis and evasion techniques. These attributes make it difficult to detect and decode them. Using knowledge gained from in-depth reverse engineering of a chosen APT malicious sample, how could we build tools to detect malicious activity from these samples in local environments? Also, how could we develop the ability to evaluate IPv4 spaces for the presence of APT implants?
6. Honeypots/Improving Internet Emulator [N | P | D]
Malicious actors are always finding innovative ways to infect systems. As a result, their cyber threats are always evolving. One solution to countering this problem is the deployment of honeypots. Honeypots are systems that mimic real environments to fool threat actors and gather intelligence on emerging infection vectors. To be effective, honeypots need to keep up with the latest infection techniques. Moreover, an Internet emulator for dynamic malware analysis is also a type of honeypot. It is not always possible to let malware communications use the Internet when performing dynamic analysis. Although Inetsim is an Internet emulator that could be used to address this problem, it was developed more than 10 years ago and does not leverage the capabilities of the latest technologies. How could we develop a new and better Internet simulation for isolated dynamic malware analysis tools (not only simulating services, but also content)? How could we also simulate Industrial Control Systems and user interactions? Could we build a system that functions as both Internet emulator and honeypot? How could we leverage new concepts like machine learning and the cloud, or improve the honeypot/Internet emulator response rates to collect sophisticated malicious artifacts and emerging threats?
7. SPAM/Phishing/Smishing [D | P]
Analyzing SPAM emails sent from botnets, as well as phishing and smishing URLs, allows us to manually extract Indicators of Compromise (IOCs). How could we develop techniques to find and extract relevant and actionable information automatically from billions of SPAM emails and recently created domains in real time? Furthermore, how could we develop analytics to identify SPAM/phishing/smishing campaigns?
8. Memory Analysis [O | P]
When we examine configurations embedded in malware artifacts, we can find a wealth of actionable information that can be leveraged in further analyses. For example, the configuration of many malware families contains the addresses of their Command & Control servers. How could we leverage the configuration information that is available at runtime in order to derive valuable forensics data about the state of the system? How could this process be structured in a framework to permit automation and scaling (e.g., using tools like CAPE or malscan)?
9. Cyber Neighbourhood Watch [P]
It is not always easy to accurately validate the maliciousness of an indicator and evaluate the possible negative side effects of blocking its associated traffic. During a previous Cyber Centre event, an initiative was developed in collaboration with Canadian telecommunication companies to create a method in which different organizations can communicate useful information regarding network traffic to other partner organizations. This work was done by using MISP. Could we democratize the determination of an indicator’s quality and relevancy through a member voting system? How could we create a system to support such a community? Furthermore, how could this knowledge be used to enrich a national DNS?
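To make the voting idea concrete, here is a minimal Python sketch of a tally in which each member organization casts one vote per indicator and a consensus is only reported once enough members have voted. The class name, the minimum vote count and the scoring are assumptions for illustration; the sketch does not use or represent the MISP API.

```python
from collections import defaultdict


class IndicatorVotes:
    """Hypothetical member voting tally for indicator quality."""

    def __init__(self, min_votes=3):
        self.min_votes = min_votes
        self._votes = defaultdict(dict)  # indicator -> {member: +1 or -1}

    def vote(self, indicator, member, malicious):
        # A member voting again on the same indicator replaces their vote.
        self._votes[indicator][member] = 1 if malicious else -1

    def consensus(self, indicator):
        """Return a score in [-1, 1], or None if too few members voted."""
        votes = self._votes.get(indicator, {})
        if len(votes) < self.min_votes:
            return None
        return sum(votes.values()) / len(votes)
```

A score near 1 would suggest broad agreement that the indicator is malicious, while a score near 0 would signal disagreement and a need for manual review before blocking.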
10. Malicious Infrastructure and Threat Hunting [P | R]
There are many cyber threats in the wild, making it sometimes difficult to automatically recognize malicious infrastructure and phishing attempts from the large amounts of data. One method of detection is to switch the path section of a URL with another path known to be associated with malicious infrastructure. How could we distill data to identify recipes we can use to find rules that will, in turn, find more data and/or add value to the existing data? How could we validate that our recipes for rules are correct? How could we automatically recognize and attribute gathered malicious data and infrastructure?
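The path-switching technique mentioned above can be sketched in a few lines of Python with the standard `urllib.parse` module. The function names and the example path are assumptions for illustration; the sketch only builds candidate URLs and does not fetch them.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_path(url, known_path):
    """Keep the scheme and host of `url`, replace its path with `known_path`,
    and drop the query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, known_path, "", ""))


def candidate_urls(observed_urls, known_paths):
    """Combine every observed host with every known malicious path,
    preserving order and skipping duplicates."""
    seen = set()
    candidates = []
    for url in observed_urls:
        for path in known_paths:
            candidate = swap_path(url, path)
            if candidate not in seen:
                seen.add(candidate)
                candidates.append(candidate)
    return candidates
```

For example, `swap_path("http://example.com/index.html?x=1", "/panel/gate.php")` yields `http://example.com/panel/gate.php`, which could then be checked to see whether the host serves the known malicious resource.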
11. Cloud Monitoring and Analytics [D]
As more and more government services move to the Microsoft Azure cloud, it becomes essential for the Cyber Centre to understand how users can secure and monitor their cloud tenancies and associated physical infrastructure. Microsoft Azure provides logs for a client to monitor their cloud activities and resources. How could we create a framework to extract, analyze and visualize cloud logs into actionable information? How could we derive a set of good practices for Microsoft Azure cloud users and work with cloud providers to improve their services?
12. Infrastructure Mapping [R | D]
In order to properly respond to malicious actors, it is necessary to start by understanding the malicious infrastructure they leverage. How could we pool IOCs from industry, government and law enforcement agencies and enrich them to their fullest degree to map the architecture of the malicious infrastructure?
13. Operationalize Hunting Malicious Sample [R | D]
Operational hunting is different from typical IOC hunting. It specifically looks for actionable intelligence that can be used to build a case against the malicious actor behind a sample. How could we effectively hunt samples (through host analysis, surface analysis, reverse engineering, etc.) to determine the internal composition of a cyber threat?
14. Actor Attribution [R | D]
In order to take action against malicious actors, we need to build a strong case with backing admissible evidence clearly identifying the individual or individuals responsible for the cyber threat. How could we identify specific individuals behind the attacks by researching IOC and actor monikers?
15. Cross Organization Data Harvesting and Analytics [D]
One of the biggest issues behind cross-organizational collaborative hunting is attempting to find a mutual method for sharing, storing, normalizing and visualizing the data. How could we ingest vast amounts of data on malicious infrastructure, malware and actor attribution? How could we leverage the activities done by other GeekWeek teams to produce actionable output?
16. Follow the Money [P | D]
Cryptocurrencies are often used as the method of choice by threat actors to receive payments from their victims. How could we mine the blockchain and gather key information about this ecosystem? How could we use this data to trace back the actors or group cyber threats together? How could we find a repeatable and sustainable way to measure the costs and revenues of cyber threat actors, as well as the cyber threat economy?
17. Contributing to Community Tools [P]
Open source tools and services built by the cyber community play an important role in making cyber security more accessible. These tools strengthen our response capacity against cyber threats. For instance, how could we work together to improve community services and tools, such as MISP, Malpedia or Ghidra, and connect more people to these services? How could we contribute to open source tools with code that could benefit the entire cyber community?
18. Advanced Genetic Malware Analysis [R | P | D]
“Information retrieval” is an emerging malware analysis and reverse engineering technique that decomposes new unknown malware into existing known components and wheels from the existing data. The existing tools, such as Intezer, rely on exact code matching algorithms and cannot retrieve information that slightly differs from the original code. How could we develop an information retrieval system able to perform inexact matching that is also able to adapt to different CPU architectures? How could we leverage existing tools, such as Kam1n0 and Ghidra, to design a system both flexible and scalable?
19. Advanced Malware Clustering [P]
New malware is discovered every day, but a majority of discoveries are variants of existing malware. How can we automatically cluster those variants based on their behaviour and then group them with their affiliated family? How can we extract shareable IOCs and signatures from a cluster of similar malware?
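As one possible baseline for behaviour-based clustering, the Python sketch below groups samples whose sets of observed behaviours overlap strongly, using Jaccard similarity and a union-find structure. The behaviour labels, the threshold and the function names are assumptions for illustration; a real system would need richer features and a more scalable algorithm.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(samples, threshold=0.5):
    """Group samples (name -> set of behaviours) whose similarity reaches
    `threshold`, merging transitively. Returns sorted lists of names."""
    names = sorted(samples)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(samples[a], samples[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())
```

Behaviours shared by every member of a cluster (the intersection of their sets) would be natural candidates for shareable IOCs or signatures.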
20. Advanced Malicious Infrastructure Clustering [P]
Malicious infrastructure changes constantly and makes significant efforts to hide its affiliation. Similarly, new phishing websites that impersonate companies are created every day, always containing minor differences to evade automated detection. How could we use machine learning and image recognition algorithms to automatically cluster and attribute malicious infrastructure and websites?
21. Validation Cyber Threat Infrastructure [P]
As malicious actors change and relocate their infrastructure to avoid detection, we need to constantly verify that our data is still valid. How could we build a system that regularly browses the Internet to validate the information on malicious infrastructure and ensure it is still active?
22. Automated Knowledge Extraction (a.k.a. It is all about graphs!) [P]
A cyber threat story usually starts with only a handful of indicators. The analyst then manually creates relationships with other information sources to complete the story and discern the full picture. How could we automate this process? How could we automatically draw a graph of indicators that shows the analyst the entire story without manual processing?
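A very small version of this automation is a breadth-first expansion from the starting indicators over known relationships. The Python sketch below assumes relationships are already available as pairs of indicators (for example a domain and the IP address it resolved to); the function name, the hop limit and the data are hypothetical.

```python
from collections import deque


def expand_story(seeds, relations, max_hops=2):
    """Return {indicator: hop distance} for everything reachable from the
    seed indicators within `max_hops` relationship links."""
    neighbours = {}
    for a, b in relations:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] >= max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance
```

The hop limit keeps the graph readable; without it, a single shared hosting IP address could pull large amounts of unrelated infrastructure into the story.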
23. Retro-Hunting [R | P]
Even for the most seasoned reverse-engineer, drafting the perfect YARA rule for a malicious sample is a challenge. Retro-hunting allows analysts to assess the quality of their signature at scale. Tools like BigGrep can be used to search for malicious samples in the past that would have triggered a given signature by indexing a large quantity of binary files. How could we optimize the speed, size, and results of retro-hunting tools to improve our cyber threat intelligence capabilities?
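The general idea of indexing binaries for retro-hunting can be illustrated with a toy n-gram index in Python: the index narrows the search to candidate files, and each candidate is then verified against the full byte pattern. This is a simplified sketch with hypothetical function names and in-memory data, not a description of how BigGrep is implemented.

```python
def ngrams(data, n=3):
    """Return the set of all n-byte substrings of `data`."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def build_index(files, n=3):
    """Map each n-gram to the names of the files that contain it."""
    index = {}
    for name, data in files.items():
        for gram in ngrams(data, n):
            index.setdefault(gram, set()).add(name)
    return index


def search(index, files, pattern, n=3):
    """Find files containing `pattern`: shortlist via the index, then verify."""
    grams = ngrams(pattern, n)
    if not grams:
        candidates = set(files)  # pattern shorter than n: index cannot help
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    return sorted(name for name in candidates if pattern in files[name])
```

The verification step matters because a file can contain every n-gram of a pattern without containing the pattern itself. The trade-off between index size and shortlist precision is governed by the choice of `n`.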
24. Malware Intake [P]
When looking for help on malicious cyber activity, it is hard to pinpoint the appropriate entity to escalate to. A single point of contact would make cyber event reporting more approachable and user-friendly. How could we develop an automated, centralized system that would allow all incoming reports of malware, spam and phishing to be sorted by point of origin and dispatched automatically to the proper addressing authority?
25. Advanced Sandboxes [P | O]
Virtual environments are well equipped to analyze malware on popular operating systems (e.g. Windows). However, malicious actors are always improving their anti-sandbox detection techniques, making it more challenging to conduct analyses in virtual environments. In addition, the amount of malware targeting other operating systems is growing. As a result, sandboxes need to be adapted to those new types of samples. How could we improve existing sandbox tools to avoid detection and gather additional IOCs? How would we virtualize several Internet of Things (IoT) devices for malware analysis? For this project, we will be using Docker.
Frequently Asked Questions (FAQs)
Q1. Is there a fee to attend GeekWeek?
No, there is no registration fee for GeekWeek.
Q2. What is the target audience for participants?
GeekWeek is targeted towards cyber security professionals, computer experts, and big data or computer enthusiasts who are interested in finding solutions to the cyber security industry’s biggest problems.
Q3. What can I expect during GeekWeek?
The exact itinerary will be shared closer to the event, but expect GeekWeek to be organized as follows:
- The first day: Virtual registration and system set-up, meet your captain and teammates, and formal GeekWeek kick-off.
- Throughout the event:
  - During the days: Work on your assigned project with your team.
  - During the evenings: Participate in organized social and networking activities with fellow GeekWeek participants.
- The final day: Final presentations, where participants will get to hear and learn what their fellow attendees have been working on.
Q4. What do I need to bring to GeekWeek?
GeekWeek is a BYOD (Bring Your Own Device) event. Participants will need their own laptops, where they will pre-install all needed tools and store all data that will be required for their project. During the event, participants will be connected to the Internet and the GeekWeek cloud, giving them access to shared resources and datasets.
If you require something (e.g. tool, data) that is essential to your project but not available to you, please mention it in your application.
Q5. Can I come to only attend the presentations on the final day?
GeekWeek participants and invited cyber security executives from government and industry will be the sole attendees for the final presentations.
Q6. Can I submit my own project idea?
Yes, applicants can submit their own project ideas along with their applications. We want to hear ideas and concerns directly from the cyber security community, and then work together to tackle these problems. But if you don’t have a specific project idea in mind, GeekWeek has a list of proposed projects and themes for participants to choose from.
The following organizations are just a few who will be participating in GeekWeek 7. The Cyber Centre would like to thank all organizations attending this year. Cyber security is a team sport, and this type of event could not be possible without the help of all participants.
To learn more about GeekWeek, please contact the Cyber Centre.