I competed in the CalHacks 6.0 Hackathon this March. The theme was finding a way to spread information about coronavirus in a more efficient, streamlined manner. Thinking back to when the pandemic first became big, I remembered wanting to create an algorithm to detect COVID-19 more quickly. Unsure how to go about this, I did some research. Early on, COVID-19 was detected through X-ray images of a patient's lungs. While the process has changed since the pandemic started, I believed imaging was the most credible approach, since the effects on the lungs are directly visible.
One issue I ran into during the hackathon was that my training set was not large enough for the model to predict accurately. I was getting 87.5% accuracy, which is rather low for medical technology. I used a Kaggle dataset that gave me 25 images of COVID lungs and 25 images of non-COVID lungs. I initially wanted to use unsupervised ML for prediction, first training the model to tell the two sets of lungs apart. Once the model could separate the images into two groups, the next issue was teaching it which group was which. At that point I could tell there was a difference between the two groups, but I could not tell which one was COVID and which was not.
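The "two groups, but which is which?" problem can be sketched in a few lines. This is a minimal illustration, not the code from the hackathon: it assumes plain 2-means clustering, and it uses synthetic 2-D points as hypothetical stand-ins for features extracted from the X-ray images. The function name `kmeans_two_clusters` and all values are made up for the example.

```python
import random

def kmeans_two_clusters(points, init_centers, iterations=20):
    """Plain 2-means on lists of floats; returns one cluster id (0 or 1) per point."""
    centers = [list(c) for c in init_centers]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        assignments = [
            min((0, 1), key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centers[k])))
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for k in (0, 1):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centers[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments

# Synthetic stand-ins for image features (hypothetical; real X-rays would need
# a feature-extraction step first). Two well-separated groups of 25 points each.
rng = random.Random(0)
group_a = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(25)]
group_b = [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(25)]
points = group_a + group_b

# Same data, two different starting centers.
run_1 = kmeans_two_clusters(points, [points[0], points[-1]])
run_2 = kmeans_two_clusters(points, [points[-1], points[0]])

# Both runs split the points into the same two groups...
same_partition = all((a == run_1[0]) == (b == run_2[0]) for a, b in zip(run_1, run_2))
# ...but the cluster ids themselves are swapped between the runs.
labels_flipped = all(a != b for a, b in zip(run_1, run_2))
print(same_partition, labels_flipped)
```

The clustering finds the same split either way, but whether a group ends up called 0 or 1 depends only on the starting centers. Nothing in the output says which group is the COVID one, and that is the missing piece labels provide.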
At this point I was running low on time, so I decided to train a model using supervised machine learning instead, since I had the labels for the dataset from Kaggle. This took a lot less time, and I finished with enough time to make a UI where a user can upload an X-ray image of their lungs and my model predicts whether they are COVID lungs or non-COVID lungs with 92.5% accuracy.
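To show what the labels change, here is a minimal supervised sketch. It is an assumption-laden illustration rather than the model described above: it uses a simple nearest-centroid classifier, synthetic 2-D points in place of real image features, and made-up label strings `"covid"` and `"non-covid"`. The helper names `fit_centroids` and `predict` are hypothetical.

```python
import random

def fit_centroids(features, labels):
    """Compute one mean feature vector per label (nearest-centroid classifier)."""
    centroids = {}
    for label in set(labels):
        members = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

def predict(centroids, feature):
    """Return the label whose centroid is closest to the feature vector."""
    return min(
        centroids,
        key=lambda label: sum((x - c) ** 2 for x, c in zip(feature, centroids[label])),
    )

# Synthetic stand-ins for labeled image features (hypothetical values).
rng = random.Random(1)
data = [([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], "non-covid") for _ in range(25)]
data += [([rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)], "covid") for _ in range(25)]
rng.shuffle(data)

# Hold out 10 of the 50 examples so accuracy is measured on unseen data.
train, test = data[:40], data[40:]
centroids = fit_centroids([f for f, _ in train], [l for _, l in train])

correct = sum(predict(centroids, f) == l for f, l in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.1%}")
```

Because every training example carries its label, the model's output is a named class rather than an anonymous cluster id. With only 50 examples, a held-out set is small, so a single misclassified image moves the accuracy figure by several points.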
Overall, the CalHacks Hackathon was extremely fun, and I look forward to participating in it next year with a team at UIUC.