Download all messages from a Google group

I made a simple scraping utility using Selenium and HtmlUnit that you can use. It is not very optimized, so it is only suitable for downloading messages from small groups (up to 7000 messages).

https://github.com/himukr/google-grp-scraper


Ultimately I ended up using the gdata Python library to get a list of all groups along with their respective URLs. From there I used Selenium to scrape each group for its messages and all replies. It is probably not the best solution, but it works for what I need.


If you don't mind using Bash, you can try a tool I wrote:

https://github.com/icy/google-group-crawler

It can download all mbox files from a Google Group. If you have a cookie file, you can even download all files from a private Google Group and/or see all the original emails. It can also read RSS feeds and fetch the latest posts, which is useful for a daily mirror.
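To illustrate the "fetch only the latest posts" idea behind a daily mirror, here is a rough Python sketch. It is not how the crawler itself works (that tool is written in Bash). It assumes the feed is plain RSS 2.0 with `<item>` elements. The function name `new_items`, the `SAMPLE_FEED` text, and the example.com links are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def new_items(rss_text, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links.

    Assumes an RSS 2.0 document; Atom feeds use different element names.
    """
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        if link and link not in seen_links:
            fresh.append((title, link))
    return fresh


# Hypothetical sample feed, not real Google Groups output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example group</title>
    <item>
      <title>First post</title>
      <link>https://example.com/p/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/p/2</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    # Links already mirrored on a previous run.
    already_mirrored = {"https://example.com/p/1"}
    for title, link in new_items(SAMPLE_FEED, already_mirrored):
        print(title, link)
```

In a real mirror you would store the set of seen links between runs (for example in a text file) and download only the links this function returns.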

An example result is here: http://l.archlinuxvn.org/archlinuxvn/. MHonArc is used to convert the mbox files into HTML format.
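If you just want to check what ended up in a downloaded mbox file without converting it to HTML, Python's standard `mailbox` module can read it. This is only a minimal sketch: the helper names `list_subjects` and `write_sample_mbox` and the file name `sample.mbox` are my own, and the sample messages are generated locally so the snippet runs without any downloaded data.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def list_subjects(mbox_path):
    """Return (sender, subject) pairs for every message in an mbox file."""
    # create=False makes a missing file an error instead of creating an empty mbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [(msg.get("From", ""), msg.get("Subject", "")) for msg in box]
    finally:
        box.close()


def write_sample_mbox(path):
    """Write two made-up messages so the example is self-contained."""
    box = mailbox.mbox(path)
    try:
        for i in (1, 2):
            msg = EmailMessage()
            msg["From"] = f"user{i}@example.com"
            msg["Subject"] = f"Test message {i}"
            msg.set_content("Hello")
            box.add(msg)
        box.flush()
    finally:
        box.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "sample.mbox")
    write_sample_mbox(path)
    for sender, subject in list_subjects(path):
        print(sender, "|", subject)
```

To use it on real data, point `list_subjects` at one of the mbox files the crawler saved instead of the generated sample.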