Breast Cancer Classification Challenge

About

This hybrid challenge encourages students to apply their knowledge of biological sciences and their computational skills to develop a robust machine learning model capable of distinguishing between malignant and benign breast tumors. By working on this real-world application of computational biology, students will gain hands-on experience in data preprocessing, model training, and evaluation.

Date: March 29-30
Start Time: 1:00 PM
Start Location: 203 Lawrence

Join the Challenge Discord -> https://discord.gg/HXuCKwK4

Complete the Pre-Test Survey

Prizes

First Place: $100

Second Place: $75

Third Place: $50

Project Information

Objective

  • Breast cancer is the most common cancer among women globally, representing 25% of all cancer cases. Early detection is critical, as it improves treatment outcomes and survival rates. A key challenge in detection is accurately classifying tumors as malignant (cancerous) or benign (non-cancerous). Your objective is to use Python to develop and evaluate a supervised machine learning model that classifies tumors.
  • Models must be trained using standard machine learning techniques.

Guidelines

Dataset & Data Processing

Model Development

  • Implement at least one machine learning model (e.g., logistic regression, decision tree, random forest, SVM).
  • Tune hyperparameters for optimal performance.
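As a rough illustration of the two steps above, the sketch below fits a small logistic regression by gradient descent and tunes two hyperparameters (learning rate and L2 penalty) with a simple grid search on a held-out validation split. Everything in it is an assumption made for illustration: the function names, the grid values, and especially the synthetic two-feature data, which only stands in for the challenge dataset. In practice a library such as scikit-learn would normally replace the hand-written pieces.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lr, l2, epochs=200):
    """Fit weights and bias by batch gradient descent with an L2 penalty."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    """Return 1 (treated here as 'malignant') when the predicted probability is >= 0.5."""
    w, b = model
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_toy_data(n, seed=0):
    """Hypothetical stand-in data: two Gaussian blobs, NOT the challenge dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.5 if label else -1.5
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


def grid_search(X_train, y_train, X_val, y_val, grid):
    """Try every (lr, l2) pair and keep the one with the best validation accuracy."""
    best = None
    for lr in grid["lr"]:
        for l2 in grid["l2"]:
            model = train_logreg(X_train, y_train, lr, l2)
            score = accuracy(y_val, predict(model, X_val))
            if best is None or score > best["score"]:
                best = {"lr": lr, "l2": l2, "score": score, "model": model}
    return best


X, y = make_toy_data(300)
X_train, y_train = X[:200], y[:200]   # used to fit each candidate model
X_val, y_val = X[200:], y[200:]       # used only to compare hyperparameters

grid = {"lr": [0.01, 0.1, 1.0], "l2": [0.0, 0.01, 0.1]}  # assumed example values
best = grid_search(X_train, y_train, X_val, y_val, grid)
print("best lr:", best["lr"], "best l2:", best["l2"],
      "validation accuracy:", round(best["score"], 3))
```

Because the validation split is used to choose the hyperparameters, a separate test set that the search never sees gives a more honest estimate of final performance.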

Performance Evaluation

Model Report

Create a report discussing the following:

  • Objective of the project
  • Methods of data preprocessing
    • Include the exploratory data analysis
  • Model Selection
    • Discuss the machine learning models tested
    • Justify why you chose your model
  • Results & Performance Evaluation
    • Present key performance metrics
      • e.g., accuracy, precision, recall, F1-score, ROC-AUC
    • Show a confusion matrix and discuss errors/limitations in the model.
  • Conclusion
    • Summarize the main findings
    • Discuss limitations
Include visuals (graphs, etc.) in the report.
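To make the metrics named under Results & Performance Evaluation concrete, here is a minimal sketch that derives them from a confusion matrix in plain Python. The function names, the 0.5 decision threshold, and the eight example labels and scores are all hypothetical, chosen only so the arithmetic is easy to check by hand; the label 1 is assumed to mean malignant.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1, with 0.0 when a ratio is undefined."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. This pairwise form is O(n^2), fine for small data.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: true labels and predicted probabilities of malignancy.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("confusion matrix (tp, fp, tn, fn):", (tp, fp, tn, fn))  # (3, 1, 3, 1)
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy, precision, recall and f1 are all 0.75 here
auc = roc_auc(y_true, scores)
print("ROC-AUC:", auc)  # 0.9375
```

In this setting a false negative is a malignant tumor classified as benign, so recall on the malignant class is often worth discussing separately from overall accuracy.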

Evaluation Criteria

  • Innovation & Impact (20%)
  • Technical Implementation (30%)
  • Performance Metrics & Accuracy (20%)
  • Quality of Report (20%)
  • Code Documentation (10%)

Additional Resources

Submission

Final Submissions are due March 30 at 2:00 PM

To submit your project, create a public GitHub repository containing all of your team’s code, the report, and all visuals. Once your repository is complete, fill out the submission form to submit your project. The results will be sent out next Saturday (4/5).

In addition, we would appreciate it if you took a few minutes to reflect on your experience and filled out the post-challenge survey. Your responses will help us improve future events.

Mentorship

Undergraduate student helpers will act as mentors to answer questions and guide participants through the process.

Time | Location
1:30 PM - 3:00 PM | 203 Lawrence
3:00 PM - 5:00 PM | Discord

Rules and Regulations

  • Team Size: Teams may have up to two participants. Collaboration is strongly encouraged, but solo participation is allowed.
  • Generative AI: Use of generative AI tools such as ChatGPT, Claude, or Gemini is allowed, but it must be cited in the final report.
  • Original Work: All work must be created during the challenge. Pre-existing projects or code are not allowed.
  • Intellectual Property: Teams retain rights to their work but agree to allow the organization to potentially showcase their project for future promotional purposes.
    • If you prefer your project not be showcased, please send an email to nel36@pitt.edu