Hello all,
Please join us for another Operate First Data Science Community Meetup on
Tuesday, *April 19th, 2022 at 11:00 ET*. In this talk, @Selbi Nuryyeva
<selbi(a)redhat.com> will explain how one can distribute a machine learning
workflow across several nodes with GPU hardware in a cloud environment. She
will use PyTorch for the ML training, and Kubeflow, Node Feature Discovery,
and GPU operators to distribute the ML workload.
As datasets and models grow, the demand for more powerful and efficient
GPUs is rapidly increasing. Often a single GPU is not adequate for an ML
use case. An alternative to upgrading the GPU hardware is to distribute the
ML workload either across several GPUs on one node or across multiple
nodes, each containing one or more GPUs. The multi-node option is
especially useful because a single machine can fit only so many GPUs.
Attendees will learn how to overcome the GPU hardware limits of single-node
training by taking advantage of GPUs on other machines, thereby maximizing
the utilization of GPUs in an open cloud environment.
Subscribe to our calendar
<https://calendar.google.com/calendar/u/2?cid=N3QyMm1ydm92amNmdTZqZm5ucDRu...>
for event details.
To learn more about the Operate First Data Science Community, visit our
website
<https://www.operate-first.cloud/data-science/operate-first-data-science-c...>
and find the agenda for the upcoming meetups here
<https://www.operate-first.cloud/data-science/operate-first-data-science-c...>.
We look forward to seeing you at the Meetup!
Best,
The Operate First Data Science Team
Links:
1. Calendar Invite:
https://calendar.google.com/calendar/u/2?cid=N3QyMm1ydm92amNmdTZqZm5ucDRu...
2. Topic suggestion:
https://github.com/aicoe-aiops/cloud-first-data-science-community/issues/...
3. Website:
https://www.operate-first.cloud/data-science/operate-first-data-science-c...
4. Agenda:
https://www.operate-first.cloud/data-science/operate-first-data-science-c...
5. Slack:
https://join.slack.com/t/operatefirst/shared_invite/zt-o2gn4wn8-O39g7sthT...
6. Mailing List:
https://lists.operate-first.cloud/admin/lists/community.lists.operate-fir...
7. Privacy Policy:
https://www.operate-first.cloud/data-science/operate-first-data-science-c...
--
Thanks and Regards,
Aakanksha Duggal
She/Her
Software Engineer
Red Hat Boston <https://www.redhat.com/>
AI Center of Excellence
Office of the CTO
aduggal(a)redhat.com