Why Attend

As in previous years (2016, 2017), CytoData brings together the community of researchers mining microscopy image data. This community has gained a lot of momentum over the last couple of years and shifted from proof-of-principle work to making biological discoveries. Collectively, we have gathered and continue to build on tremendous knowledge and insights into this challenging field!

We recently formed the CytoData Society (CytoDS) to build and maintain an active community around image-based profiling of biological phenotypes induced by genetic, chemical or other perturbations of biological systems. CytoData 2018 will be a great venue to connect with this community, exchange ideas, and learn from experts in the field.

Gene networks derived from image-based profiles

Attendees will join together for a one-day symposium with seminars, lectures and social events, followed by a two-day hackathon. We aspire to gather a diverse and engaged group of both image-based profiling experts and members of related fields. Everyone involved will be an active contributor (even if remote); no passive absorption of others’ ideas allowed! We assume participants will be willing to share their analysis strategies, given that each laboratory’s “competitive advantage” in this field does not come from their data pre-processing pipeline but rather from computational techniques downstream of the steps covered in the event, and from biological discoveries derived from their unique data.

Fri Sep 21 Full-day Symposium
Hear from leading researchers in the field from industry and academia.

Mon Sep 24 CytoData Hackathon - Day 1
Get oriented and start working on the first stage of the challenge.

Tue Sep 25 CytoData Hackathon - Day 2
Refine models using the second stage of the challenge, then evaluate on the held-out set.
Winners will be announced based on the held-out set.

Sponsors

Gold Sponsor

PerkinElmer

Silver Sponsors

Recursion Pharmaceuticals

Janssen Pharmaceutica

Bronze Sponsors

Phenomic AI


Prizes and Hackathon Sponsors



Symposium Schedule

Registration and breakfast

8:00 - 9:00

Session 1

9:00 - 9:30
Fighting disease with image-based profiling, Anne Carpenter, Broad Institute.

9:30 - 10:00
New 3D Analysis Tools for High-Content Screening, Alexander Schreiner, PerkinElmer.

10:00 - 10:30
A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles, Todd Golub, Broad Institute.

10:30 - 11:00
Coffee break

Session 2

11:00 - 11:30
Leveraging multiple data sources to prioritize drug combinations using Machine Learning formalisms, Arvind Rao, University of Michigan.

11:30 - 12:00
Manifold learning yields insight into complex high throughput biological data, Smita Krishnaswamy, Yale University.

12:00 - 1:00

Session 3

1:00 - 1:30
Deep Learning in Action: Automating HCS analysis with Deep Multiple Instance Learning, Sam Cooper, Phenomic AI.

1:30 - 2:00
Conditional Generative Models For Explaining Cell Variation, Greg Johnson, Allen Institute.

2:00 - 2:30
Batch Equalization, Mike Ando, Google AI.

2:30 - 3:00
Deep learning representations of cellular disease and rescue, Berton Earnshaw, Recursion Pharmaceuticals.

3:00 - 3:30
Coffee break

Session 4

3:30 - 4:00
Large-scale CRISPR screens: getting past viability, John Doench, Broad Institute.

4:00 - 4:30
Learning from high content images of tool compounds, Jeremy Jenkins, Novartis Institutes for BioMedical Research.

Session 5 Highlight Talks

4:30 - 4:45 Adam Arany, KU Leuven.

4:45 - 5:00 Rebecca Hughes, Univ. of Edinburgh.

5:00 - 5:15 Florian Heigwer, DKFZ.

5:15 - 5:30 Steven Vermeulen, MERLN-cBITE.

Closing remarks

5:30 - 5:45

Social event sponsored by PerkinElmer

6:00 - 9:00 CytoData Post-Symposium appetizer hour at Meadhall

Hackathon Schedule

The hackathon will be held in the Yellowstone room on the 2nd floor of the Broad building at 415 Main St, Cambridge, MA 02142. You don’t need Broad access to get to the 2nd floor. Let security know you are attending CytoData. If you are stuck, connect to the Broad wifi (it is open) and Slack us :)

Day 1 (Monday Sep 24)

8:30 - 9:00

9:00 - 10:00
Understand the challenge, datasets, and the metric.

10:00 - 11:00
Hands-on session

11:00 - 17:00
Hackathon session

12:00 - 13:00
Lunch will be served

Head out for dinner – details to follow.

Day 2 (Tuesday Sep 25)

8:30 - 9:00

9:00 - 10:00
Familiarize yourselves with the new dataset

10:00 - 16:00
Hackathon session

12:00 - 13:00
Lunch will be served

16:00 - 16:30
Final submission

16:30 - 17:15
Team presentations and winner announcement

Challenge format

During the challenge: Sep 24 and 25, 2018

The challenge will start with a description of the problem and a presentation of the datasets available for research. A summary of methodologies proposed before the challenge will be presented too. Teams will be formed after preliminary discussions and Q&A sessions. The data and other resources will be released when the teams are organized to start hacking.

Amazon Web Services virtual machines will be available for participants to run experiments. Example code to load data, run baseline experiments and evaluate performance will also be provided to get participants up to speed.

At the end of the two-day challenge, a holdout set will be released so that participants can match its image-based signatures to those in the previously provided sets. Submissions will be evaluated automatically to identify the team that matches signatures across datasets more accurately than the baselines and the other teams.

The winning team will take home a prize sponsored by Nvidia.

After the challenge: From Sep 26, 2018 onwards

All the data generated during the challenge will be potentially useful for comparing and analyzing different strategies for solving the cross-dataset matching problem. Interested participants can refine their approaches after the competition and contribute their results to a collaborative paper written jointly with the challenge organizers and other participants.

Logistics


Each participant should plan to finance their own travel and lodging (exception: some support can be provided for attendees from under-represented groups; please contact us at cytodata.info@gmail.com).


The closest airport is Boston Logan Airport (BOS). The Broad Institute is accessible by the MBTA red line at the Kendall/MIT stop.

Childcare Options

Hello Chime (by Sittercity): You can book online directly with babysitters; best for weekday evenings and weekends.

There is also the possibility of on-site childcare during the symposium if attendees would like it, but we need advance notice (at least 3 weeks before the event). If you’re interested in this service, have other questions regarding childcare, or if childcare expenses are a financial concern, please reach out to imagingadmin@broadinstitute.org.

Organizers

Conference Committee

General Chair: Tim Becker
Administrator: Jeanelle Ackerman

CytoData Society Board

Past president: Anne Carpenter
President: Shantanu Singh
Vice President: Florian Heigwer
Event officer: Sam Cooper
Communications officer: Alex Vasilevich
Industry liaison: Alex Vasilevich
Operations officer: Rabia Khan
Education/outreach officer: Juan Caicedo
Treasurer: Lassi Paavolainen
Society liaison: Oren Kraus
Resource officer: Lassi Paavolainen


The CytoData Society (CytoDS) is an international society that builds and maintains an active community around image-based profiling of biological phenotypes induced by genetic, chemical or other perturbations of biological systems. This field is still in its infancy, and its adoption has been slow, primarily due to the data analysis challenges posed, the large imaging experiments required, and the lack of ground-truth data sets where relationships among images are known. Our goal is therefore to bring image-based profiling into the mainstream by organizing the community working on this technology.

Hosts

Carpenter Lab

The Carpenter Lab / Imaging Platform is based at the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, USA. Our research group develops advanced methods and software tools to quantify and mine the rich information present in cellular images to yield biological discoveries. Our laboratory is best known for our open source software packages CellProfiler and CellProfiler Analyst, but a major focus these days is profiling: we are using advanced machine learning methods to identify morphological patterns in cell populations, to probe the causes and cures for disease.