October 3, 2016 – The development and maturation of automated data tools for cancer research, one of the objectives outlined in the White House’s Cancer Moonshot initiative, could give medical researchers and policymakers an unprecedented view of the U.S. cancer population, at a level of detail typically obtained only for clinical trial patients (less than 5 percent of the overall cancer population). Using the Titan supercomputer, a team led by Oak Ridge National Laboratory’s Georgia Tourassi is making progress toward this goal by employing deep learning techniques to extract useful information from text-based cancer pathology reports. So far, the team has established deep learning’s advantages in multi-task learning, using nearly 2,000 cancer pathology reports to train a neural network to identify a cancer’s primary site and laterality from text. In another study, Tourassi’s team deployed deep learning to match the cancer’s origin to a corresponding topography code, a more specific classification than primary site or laterality. The promising performance trends measured in these early studies will guide the team as it scales up deep learning to tackle larger datasets and moves toward less human supervision.