{Health}Re·gen
/ helθ riːdʒɛn / noun
a neural network that assesses an individual’s predisposition to cancer
Andra-Gabriela Cîrstoiu
Founder & Developer
{Health}Re·gen Platform

Oncologic risk prediction platform based on neural networks and Bayesian models

An advanced oncologic predisposition assessment system that integrates clinical data, exposure to risk factors, and family history for a more personalized and preventive approach.

  • 100,000: NIH Records
  • 67.94%: Model Accuracy
  • ±13.02%: Margin of Error

The estimated ±13.02% margin of error reflects clinical bias, historical misdiagnoses, label noise, and environmental variability across patient profiles.

Global Context

According to the U.S. National Cancer Institute, approximately 40% of people in the United States will be diagnosed with cancer at some point in their lifetime.

Cancer arises when mutations disable DNA-repair and growth-control genes, leading to unchecked cell proliferation and malignant tumor development.

Disease Genetics

Unlike monogenic disorders, most forms of cancer are polygenic and multifactorial.

Accurate prediction therefore requires a broader analysis of genetic context interacting with environmental exposure, stress, lifestyle, and triggering conditions.

Methodology & Workflow

HealthRegen is designed as a calibrated preventive pipeline: it learns from large clinical datasets, detects early statistical signals of predisposition, requests missing data when needed, and improves through validated outcomes.

1. Training on Large-Scale Data

The model was trained on 100,000 anonymized records from the National Institutes of Health (NIH), covering patients aged 21 to 65 and balanced across sex and race.

2. Pattern Detection

The neural network detects macro- and micro-patterns that may signal cancer predisposition, identifying statistically relevant neighboring features and temporal trends across datasets.

3. Patient Input & Risk Computation

The system receives patient records, timing, exposures, lifestyle, and family history, then computes a calibrated probabilistic risk estimate instead of a rigid binary answer.
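As an illustration of a probabilistic estimate rather than a binary answer, the following minimal sketch updates a baseline risk with Bayes' rule in odds form. It is not the platform's actual model: the function name `update_risk`, the baseline value, and the likelihood ratios are hypothetical, and the factors are assumed to be independent.

```python
def update_risk(prior: float, likelihood_ratios: list[float]) -> float:
    """Return a posterior probability from a prior and per-factor likelihood ratios.

    Uses the odds form of Bayes' rule and assumes the factors are independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)  # convert probability to odds
    for lr in likelihood_ratios:
        if lr <= 0:
            raise ValueError("likelihood ratios must be positive")
        odds *= lr  # each factor scales the odds
    return odds / (1.0 + odds)  # convert odds back to probability


# Hypothetical values, for illustration only (not clinical figures).
baseline = 0.10
factors = {"family_history": 2.0, "smoking_exposure": 1.5}

risk = update_risk(baseline, list(factors.values()))
print(f"Estimated risk: {risk:.3f}")
```

The output is a probability that can be compared against screening thresholds, instead of a yes/no label.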

4. Missing Data Recovery

If the profile is incomplete, the platform requests additional fields and redirects the process toward the next relevant medical checks instead of forcing an unreliable assessment.
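A completeness gate of this kind could look like the sketch below. The field names in `REQUIRED_FIELDS` and the functions `missing_fields` and `assess` are assumptions made for illustration; the platform's real schema is not described here.

```python
# Hypothetical required fields; the real profile schema may differ.
REQUIRED_FIELDS = ("age", "sex", "family_history", "exposures", "lifestyle")


def missing_fields(profile: dict) -> list[str]:
    """List required fields that are absent or empty in the profile."""
    return [f for f in REQUIRED_FIELDS if profile.get(f) in (None, "", [])]


def assess(profile: dict) -> dict:
    """Request missing data instead of scoring an incomplete profile."""
    gaps = missing_fields(profile)
    if gaps:
        return {"status": "incomplete", "request": gaps}
    return {"status": "ready"}


print(assess({"age": 45, "sex": "F"}))
```

Returning the list of gaps, rather than a score, keeps an incomplete profile from producing an unreliable estimate.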

5. Validation & Model Update

When real outcomes become available, the system learns from confirmed results, analyzes bias and label noise, and updates the model to improve calibration and stability.
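One common way to check calibration once confirmed outcomes arrive is the Brier score, together with a binned comparison of predicted risk and observed rate. The sketch below is a generic illustration, not the platform's validation code; the function names and sample values are hypothetical.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs: list[float], outcomes: list[int], n_bins: int = 5):
    """Per bin: (mean predicted risk, observed outcome rate, sample count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:  # skip empty bins
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            rows.append((round(mean_p, 3), round(rate, 3), len(b)))
    return rows


# Hypothetical predictions and confirmed outcomes.
probs = [0.1, 0.2, 0.8, 0.9]
outcomes = [0, 0, 1, 1]
print(brier_score(probs, outcomes))
print(calibration_table(probs, outcomes))
```

A lower Brier score after a model update, with bin-level predicted risks closer to observed rates, would indicate improved calibration.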

Observed Results

  • 50k: Patients in the Test Cohort
  • 67.94%: Accuracy in Predicting Cancer Predisposition
  • ≥7%: Potential Improvement from Raw Digitized Records

On a 50,000-patient test cohort, the model reached 67.94% accuracy in predicting cancer predisposition. The estimated ±13.02% margin reflects clinical bias, misdiagnoses, and environmental variation. Integrating raw digitized medical records would likely reduce this error and improve calibration and stability by at least 7%.
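The reported figures imply the following bounds and counts. This is plain arithmetic on the numbers stated above, assuming the margin is applied symmetrically in percentage points; the variable names are illustrative.

```python
cohort_size = 50_000   # patients in the test cohort
accuracy = 67.94       # reported accuracy, in percent
margin = 13.02         # reported margin of error, in percentage points

# Bounds, assuming a symmetric margin around the reported accuracy.
lower = round(accuracy - margin, 2)
upper = round(accuracy + margin, 2)

# Number of correct predictions implied by the reported accuracy.
correct = round(cohort_size * accuracy / 100)

print(f"Accuracy range: {lower}% to {upper}%")
print(f"Correct predictions: {correct} of {cohort_size}")
```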

Clinical Integration

The platform is designed to support medical decision-making, not replace it. In practice, it can:

  • Request missing fields in a patient profile.
  • Flag possible predispositions that justify immediate screening.
  • Guide clinicians toward the next relevant medical checks.
  • Provide feedback on modifiable risk factors such as stress, lifestyle, and environmental exposure.

Limitations & Ethics

Prediction is not diagnosis. Output quality depends directly on the quality, completeness, and representativeness of the input data.

Models must be continuously monitored for demographic bias, label noise, dataset imbalance, and environmental mismatch between training populations and real-world patients.
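A simple form of such monitoring is to compare accuracy across demographic groups and track the largest gap. The sketch below is a generic illustration with hypothetical group labels and records, not the platform's monitoring code.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per group from (group, predicted, actual) tuples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {g: hits[g] / totals[g] for g in totals}


def max_accuracy_gap(per_group: dict) -> float:
    """Largest difference in accuracy between any two groups."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical records: (group, predicted label, confirmed label).
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 0, 1),
]

per_group = accuracy_by_group(records)
print(per_group)
print(max_accuracy_gap(per_group))
```

A gap that widens over time would be a signal to re-examine the training data for imbalance or label noise in the affected group.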

Model Performance Over Time

The evolution of {Health}Re·gen model accuracy, together with upper and lower uncertainty bounds.

[Chart: HealthRegen’s accuracy over time, with three series: accuracy (%), upper bound (accuracy + margin) (%), and lower bound (accuracy - margin) (%).]

Programs Through Which the Project Was Developed

  • Dean's List Finalist 2024: FIRST Tech Challenge World Championship, April 17–20, 2024
  • Girls Who Code: Data Science, AI, Machine Learning, Cybersecurity & Cryptography, July 1 – August 9, 2024
  • Yale Young Global Scholars: Innovation in Science and Technology, June 22 – July 4, 2025
  • Innovation in Programming & Robotics Scholarship: Invited to pitch by ING Global Board (Sep 11, 2025); ING Hubs Invention Fair (Sep 24, 2025); ReCoNnect Consortium Innovation Prize (Sep 25, 2025)
  • Accepted to Yale University: Bioengineering & Computer Science, $100,000/year Scholarship

Conclusion

Calibrated cancer-risk prediction based on routine medical data is a safer and more actionable approach than irreversible genetic intervention for diffuse, context-dependent risk scenarios.
