AWS Cloudwatch Log - Is it possible to export existing log data from it?

There is also a Python project called awslogs that makes it easy to fetch the logs: https://github.com/jorgebastida/awslogs

It supports things like:

list log groups:

$ awslogs groups

list streams for a given log group:

$ awslogs streams /var/log/syslog

get the log records from all streams:

$ awslogs get /var/log/syslog

get the log records from a specific stream:

$ awslogs get /var/log/syslog stream_A

and much more: filtering by time period, watching log streams as they come in, and so on.
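
For example (flags as documented in the awslogs README; double-check them against your installed version):

get the records from the last two hours:

$ awslogs get /var/log/syslog ALL --start='2h ago'

watch a group and print new records as they arrive:

$ awslogs get /var/log/syslog ALL --watch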

I think this tool might help you do what you want.


The latest AWS CLI includes a CloudWatch Logs command, which allows you to download the logs as JSON, text, or any other output format supported by the AWS CLI.

For example, to fetch the first 1 MB (up to 10,000 log entries) from stream a in log group A into a text file, run:

aws logs get-log-events \
   --log-group-name A --log-stream-name a \
   --output text > a.log

The command is currently limited to a response of at most 1 MB (up to 10,000 records per request); if you have more, you need to implement your own paging using the --next-token parameter, passing in the forward token returned by the previous call. I expect that in the future the CLI will also allow a full dump in a single command.
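
Here's a minimal sketch of such a paging loop. It assumes jq is installed and relies on get-log-events returning the same nextForwardToken once the stream is exhausted (the documented stop condition); adapt the group and stream names to your own:

TOKEN=""
while true; do
  if [ -z "$TOKEN" ]; then
    RESP=$(aws logs get-log-events --log-group-name A --log-stream-name a \
      --start-from-head --output json)
  else
    RESP=$(aws logs get-log-events --log-group-name A --log-stream-name a \
      --start-from-head --next-token "$TOKEN" --output json)
  fi
  # Append the message text of each returned event to the output file.
  echo "$RESP" | jq -r '.events[].message' >> a.log
  NEXT=$(echo "$RESP" | jq -r '.nextForwardToken')
  # Stop when the token no longer advances (end of stream) or is missing.
  if [ -z "$NEXT" ] || [ "$NEXT" = "null" ] || [ "$NEXT" = "$TOKEN" ]; then
    break
  fi
  TOKEN="$NEXT"
done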

Update

Here's a small Bash script that lists events from all streams in a specific group, starting from a specified time:

#!/bin/bash
# Dump all events since $starttime from every stream in $LOGGROUP.
function dumpstreams() {
  aws $AWSARGS logs describe-log-streams \
    --order-by LastEventTime --log-group-name $LOGGROUP \
    --output text | while read -a st; do
      # Field positions follow the columns of the text output:
      # st[1] is the stream ARN, st[4] the last-event timestamp.
      # Skip streams whose last event is older than the start time.
      [ "${st[4]}" -lt "$starttime" ] && continue
      # The stream name is the part of the ARN after the last colon.
      stname="${st[1]}"
      echo ${stname##*:}
    done | while read stream; do
      aws $AWSARGS logs get-log-events \
        --start-from-head --start-time $starttime \
        --log-group-name $LOGGROUP --log-stream-name $stream --output text
    done
}

AWSARGS="--profile myprofile --region us-east-1"
LOGGROUP="some-log-group"
TAIL=                                        # set to any non-empty value to keep tailing
starttime=$(date --date "-1 week" +%s)000    # CloudWatch expects milliseconds
nexttime=$(date +%s)000
dumpstreams
if [ -n "$TAIL" ]; then
  while true; do
    starttime=$nexttime
    nexttime=$(date +%s)000
    sleep 1
    dumpstreams
  done
fi

That last part: if you set TAIL, the script will keep fetching log events and will report newer events as they come in (with some expected delay).
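
For example, if you save the script as dump-logs.sh (the file name is just for illustration), a one-off dump of the last week goes to a file like this:

$ chmod +x dump-logs.sh
$ ./dump-logs.sh > some-log-group.log

To tail instead, set TAIL to any non-empty value (e.g. TAIL=yes) in the script before running it.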