NASA’s Geospatial Foundation Model
Key Points
- Foundation models are large‑scale neural networks pretrained on massive datasets that can transfer learned knowledge to new tasks through fine‑tuning with relatively few labeled examples.
- NASA archives roughly 70 PB of Earth‑science satellite imagery (projected to hit ~300 PB by 2030), providing an unparalleled reservoir of data for climate‑related research.
- In partnership with IBM, NASA released the open‑source “IBM NASA Geospatial” foundation model on Hugging Face, which leverages transformer architecture to compress raw satellite images into useful representations for many downstream tasks.
- By extracting structure from raw imagery, the geospatial foundation model dramatically reduces the need for time‑intensive human annotation, speeding up analysis of crops, forests, and other land‑cover features.
Sections
- NASA Data Fuels Foundation Models - The speaker defines foundation models, highlights the thousands of open‑source versions on Hugging Face, and proposes NASA’s massive Earth‑science dataset as a valuable resource for training and fine‑tuning these models.
- Foundation Models Transform Satellite Analysis - The speaker explains how foundation models automate satellite image labeling, enable fine‑tuned flood and wildfire mapping, and can be repurposed for tasks like deforestation tracking, crop‑yield prediction, and greenhouse‑gas monitoring, dramatically expanding the utility of NASA Earth‑science data.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=QPQy7jUpmyA](https://www.youtube.com/watch?v=QPQy7jUpmyA)
**Duration:** 00:05:13
- [00:00:00](https://www.youtube.com/watch?v=QPQy7jUpmyA&t=0s) **NASA Data Fuels Foundation Models**
- [00:03:08](https://www.youtube.com/watch?v=QPQy7jUpmyA&t=188s) **Foundation Models Transform Satellite Analysis**
If you head over to Hugging Face, you will find literally thousands of foundation models available for download.
And that's just the open source ones.
So this does beg the question, why are there so many foundation models?
Well, to help answer that, we're going to look to NASA.
But first, we should probably define what a foundation model actually is.
And look, I have a whole video on that topic.
So for now, let's just say that foundation models, which is what this represents here,
are large scale neural networks trained on vast amounts of data,
and they serve as a base or a "foundation" for a multitude of applications.
And a foundation model can apply information it's learned about one situation to a different situation it was not trained on. We call that transfer learning.
Pre-train a foundation model, and you can teach it an entirely new task with a limited set of hand-labeled examples.
So if we pick a foundation model that has ingested the right data and we provide the right fine tuning,
we can put it to work in our own specific applications.
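To make that fine-tuning idea concrete, here is a toy, pure-Python sketch; it is not the actual NASA/IBM pipeline. The "backbone" below is just a fixed function standing in for a frozen pretrained network, and only a tiny linear head is trained, on four hand-labeled examples:

```python
# A stand-in for a pretrained backbone: a *frozen* feature extractor.
# In a real workflow this would be a large network whose weights were
# learned on vast unlabeled data; here it is just a fixed nonlinear
# map from a 2-number input to a 3-number feature vector.
def frozen_backbone(x):
    a, b = x
    return [a + b, a * b, abs(a - b)]

# Fine-tuning: train only a tiny linear "head" on a handful of
# hand-labeled examples (the transfer-learning step).
def train_head(examples, epochs=200, lr=0.1):
    w = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, label in examples:
            feats = frozen_backbone(x)
            score = sum(wi * fi for wi, fi in zip(w, feats)) + bias
            pred = 1 if score > 0 else 0
            err = label - pred  # simple perceptron update
            w = [wi + lr * err * fi for wi, fi in zip(w, feats)]
            bias += lr * err
    return w, bias

def predict(w, bias, x):
    feats = frozen_backbone(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, feats)) + bias > 0 else 0

# Only four labeled examples: label 1 when both inputs are "large".
labeled = [((2, 3), 1), ((3, 2), 1), ((0, 1), 0), ((1, 0), 0)]
w, bias = train_head(labeled)
print(predict(w, bias, (4, 4)))  # 1 -- an unseen "large" input
print(predict(w, bias, (0, 0)))  # 0 -- an unseen "small" input
```

The point of the sketch: the expensive part (the backbone) is never retrained; only the small head is, which is why a limited set of labeled examples suffices.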
Which brings us to NASA.
If you're looking for huge amounts of data, well, look no further than NASA.
Today we are sitting on about 70 petabytes of earth science data captured from satellite images,
which sounds like a lot,
but by 2030, with the launch of a dozen or so new space missions,
that number is expected to be closer to 300 petabytes of data.
So we have a vast, vast amount of data.
And we may be able to use that to provide insights into, well, all sorts of climate-related discoveries.
But how can we possibly utilize it?
Well, through a foundation model, of course!
Now, for the last six months, NASA has been working with IBM to create an AI foundation model for Earth observations.
And now you and, well, anybody who wants it can download the whole thing.
It's called the "IBM NASA Geospatial model,"
and it's an open-source model available on Hugging Face.
And this geospatial foundation model really does help us to answer the question of why are there so many foundation models?
Look, underpinning all foundation models is the concept of a transformer.
That's an AI architecture that can turn heaps of raw data, be that text or audio or, in this case,
satellite images, into a compressed representation that captures the data's basic structure.
And then we can use that representation with a foundation model for a wide variety of tasks with some extra labeled data and tuning.
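As a rough illustration of what "compressing raw pixels into a representation" involves, here is a minimal sketch of the patch-embedding step a vision transformer typically begins with; the image, patch size, and projection weights are all made up for illustration, and this is not the actual IBM NASA Geospatial architecture:

```python
# Minimal sketch of a vision-transformer-style patch embedding:
# cut an image into small patches, flatten each patch, and project
# it to a short feature vector (a "token"). Illustrative only.

def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened patch x patch tiles."""
    h, w = len(image), len(image[0])
    patches = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tile = [image[i + di][j + dj]
                    for di in range(patch) for dj in range(patch)]
            patches.append(tile)
    return patches

def embed(patches, weights):
    """Linearly project each flattened patch into a short feature vector."""
    return [[sum(w * p for w, p in zip(row, tile)) for row in weights]
            for tile in patches]

# A 4x4 "image" with 2x2 patches -> 4 patches of 4 pixels each.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
patches = patchify(image, 2)

# A fixed 2x4 projection: each 4-pixel patch becomes a 2-number token.
weights = [[0.25, 0.25, 0.25, 0.25],  # mean brightness of the patch
           [-1.0, 1.0, -1.0, 1.0]]    # a simple horizontal-edge filter
tokens = embed(patches, weights)
print(tokens[0])  # token for the top-left patch: [3.5, 2.0]
```

In a real transformer the projection is learned and followed by attention layers, but the shape of the operation, raw pixels in, compact tokens out, is the same.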
Now, look,
traditionally, analyzing satellite data like this has been a tedious process
because of the time required for human experts to annotate features.
So in each satellite image, we label features: this group of pixels, that's crops;
that group of pixels, that's trees; and so forth.
Having a human go through this takes a lot of time.
So foundation models can cut out a lot of this manual effort by extracting the structure of raw natural images
so that fewer labeled examples are needed.
This foundation model has been fine-tuned to allow users to map the extent of past US floods and wildfires.
Why do that?
Because these measurements then can be used to predict future areas of risk.
So we have a flood and wildfire prediction model.
Pretty cool.
But look, foundation models are, well, foundational.
We can take that model and apply our own fine tuning to build upon the model to perform different tasks entirely.
So with additional fine tuning, our flood and wildfire prediction model
can be redeployed for tasks like tracking deforestation or predicting crop yields,
or even looking at detecting and monitoring greenhouse gases.
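The "one backbone, many tasks" pattern described here can be sketched in a few lines. Everything below, the features, the heads, and the weights, is hypothetical; it only illustrates how a single frozen representation can serve several fine-tuned tasks:

```python
# Toy sketch of the "one foundation model, many tasks" pattern:
# a single frozen backbone computes one representation, and each
# downstream task attaches its own small fine-tuned head.
# Illustrative only; not the IBM NASA Geospatial model.

def backbone(pixel_patch):
    """Frozen 'foundation' features: simple summary statistics of a patch."""
    mean = sum(pixel_patch) / len(pixel_patch)
    spread = max(pixel_patch) - min(pixel_patch)
    return [mean, spread]

def make_head(weights, threshold):
    """A tiny task-specific head: linear score compared to a threshold."""
    def head(features):
        score = sum(w * f for w, f in zip(weights, features))
        return score > threshold
    return head

# Hypothetical heads for two different tasks; the weights are
# made up and stand in for what fine-tuning would learn.
flood_head = make_head([1.0, -0.5], threshold=0.6)
deforestation_head = make_head([0.2, 1.0], threshold=5.0)

patch = [0.9, 0.8, 0.9, 0.8]
features = backbone(patch)  # computed once, reused by every head
print(flood_head(features), deforestation_head(features))
```

Swapping in a new task means training a new head (or lightly tuning the backbone), not starting over, which is what makes redeployment to deforestation or crop-yield tasks cheap.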
In fact, Clark University is adapting this very model for other applications,
including time series segmentation and similarity search.
So in this case, foundation models are multiplying the usefulness of NASA data
where fine tuning can adapt these models to new use cases.
And look, that's just NASA Earth science data.
Those thousands of open source foundation models that I mentioned at the beginning,
they are trained and tuned on a wide variety of other data,
for applications like code generation or the needs of a specific industry.
So by selecting the right foundation model and adapting it,
we can put that model to work in new ways to meet our needs.
And that is why there are so many foundation models available
and why there are so many more to come.