Get both the headers and the body of a curl response in two separate variables?

head=true
while IFS= read -r line; do 
    if $head; then 
        if [[ -z $line ]]; then 
            head=false
        else
            headers+=("$line")
        fi
    else
        body+=("$line")
    fi
done < <(curl -sD - "$url" | sed 's/\r$//')
printf "%s\n" "${headers[@]}"
echo ===
printf "%s\n" "${body[@]}"
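
To try the loop without a network connection, here is a minimal sketch that feeds it a canned response. The fake_response function and its contents are made up for illustration; it stands in for curl -sD - "$url". The CR is stripped with sed $'s/\r$//' (bash ANSI-C quoting) because some sed implementations, BSD sed for one, do not treat \r in a pattern as a carriage return.

```shell
#!/usr/bin/env bash
# fake_response is a hypothetical stand-in for: curl -sD - "$url"
fake_response() {
    printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\nworld\n'
}

headers=()
body=()
head=true
while IFS= read -r line; do
    if $head; then
        if [[ -z $line ]]; then
            head=false            # first blank line: the headers are done
        else
            headers+=("$line")
        fi
    else
        body+=("$line")
    fi
done < <(fake_response | sed $'s/\r$//')

printf 'headers: %d, body lines: %d\n' "${#headers[@]}" "${#body[@]}"
```

Initializing headers and body as empty arrays up front keeps the result predictable if the snippet is run twice in the same shell.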

To join the elements of an array into a single scalar variable:

the_body=$( IFS=$'\n'; echo "${body[*]}" )
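
A quick check of the join, using a made-up three-line body array. The braces matter: "${body[*]}" joins the elements with the first character of IFS, and the subshell keeps the IFS change from leaking into the rest of the script.

```shell
#!/usr/bin/env bash
# Sample data; in the loop above, body is filled from the curl output.
body=("<html>" "  <p>hi</p>" "</html>")

# Join the array elements with newlines inside a subshell.
the_body=$( IFS=$'\n'; echo "${body[*]}" )

printf '%s\n' "$the_body"
```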

In bash 4.3 or later, you can use a nameref (declare -n) to simplify switching from "header" mode to "body" mode:

declare -n section=headers
while IFS= read -r line; do
    if [[ $line = $'\r' ]]; then
        declare -n section=body
    fi
    section+=("$line")
done < <(curl -sD - "$url")
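
Note that the loop above stores the blank separator line in body and leaves the trailing CR on every header line. A variant that avoids both, as a sketch only: fake_response is a made-up stand-in for curl -sD - "$url", and ${!section} is used to read the name the nameref currently points at.

```shell
#!/usr/bin/env bash
# Needs bash 4.3+ (namerefs). fake_response is a hypothetical stand-in
# for: curl -sD - "$url"
fake_response() {
    printf 'HTTP/1.1 200 OK\r\nX-Demo: 1\r\n\r\nfirst\r\n\r\nthird\r\n'
}

headers=()
body=()
declare -n section=headers
while IFS= read -r line; do
    line=${line%$'\r'}                   # drop the trailing CR, if any
    # Only the first blank line ends the headers; later ones belong to the body.
    if [[ ${!section} == headers && -z $line ]]; then
        declare -n section=body
        continue
    fi
    section+=("$line")
done < <(fake_response)

printf '%d header lines, %d body lines\n' "${#headers[@]}" "${#body[@]}"
```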

For some reason, glenn jackman's answer did not catch the body part of the response. I had to move the curl request into its own command substitution and then enclose its output in double quotes. I also did not use arrays; I simply concatenated the values onto the variables. This works for me:

output=$(curl -si -d "" --request POST "https://$url")

head=true
while IFS= read -r line; do
    if $head; then 
        if [[ $line = $'\r' ]]; then
            head=false
        else
            header="$header"$'\n'"$line"
        fi
    else
        body="$body"$'\n'"$line"
    fi
done < <(echo "$output")

Thank you, Glenn!


I would like to share a way to parse curl response without any external program, bash only.

First, get the response of a curl request passing -sw "%{http_code}".

res=$(curl -sw "%{http_code}" "$url")

The result will be a string containing the body followed by the http code.

Then, get the http code:

http_code="${res:${#res}-3}"

And the body:

if [ ${#res} -eq 3 ]; then
  body=""
else
  body="${res:0:${#res}-3}"
fi

Note that if the response is exactly 3 characters long (that is, it contains nothing but the http code), the body is empty. Otherwise, strip the last 3 characters and what remains is the body.
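
The same slicing wrapped in a function, as a sketch, with canned strings in place of a live request. split_status is a name made up for this example; it sets http_code and body as globals.

```shell
#!/usr/bin/env bash
# split_status is a hypothetical helper: it takes the combined output of
#   curl -sw "%{http_code}" "$url"
# and sets the globals http_code and body.
split_status() {
    local res=$1
    http_code=${res:${#res}-3}       # last 3 characters
    body=${res:0:${#res}-3}          # everything before them (may be empty)
}

split_status '{"ok":true}200'
printf 'code=%s body=%s\n' "$http_code" "$body"

split_status '404'                   # a response with an empty body
printf 'code=%s body=%s\n' "$http_code" "$body"
```

Since ${res:0:0} expands to the empty string, the sketch does not need the explicit length check.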


Inspired by 0x10203040's answer, the following script puts the response headers into one variable, and the response body into another. I'm a noob at this, so I've probably done something unwise/inefficient; feel free to offer suggestions for improvement.

# Perform the request:
# - Optionally suppress progress output from the terminal (-s switch).
# - Include the response headers in the output (-i switch).
# - Append the response header/body sizes to the output (-w argument).
URL="https://example.com/"
response=$(curl -si -w "\n%{size_header},%{size_download}" "${URL}")

# Extract the response header size.
headerSize=$(sed -n '$ s/^\([0-9]*\),.*$/\1/ p' <<< "${response}")

# Extract the response body size.
bodySize=$(sed -n '$ s/^.*,\([0-9]*\)$/\1/ p' <<< "${response}")

# Extract the response headers.
headers="${response:0:${headerSize}}"

# Extract the response body.
body="${response:${headerSize}:${bodySize}}"

Explanation

curl – transfer a URL

--include

Use the --include (-i) option to include the response headers in curl's stdout, ahead of the response body.

--write-out

Use the --write-out (-w) option to append some useful things onto the end of curl's stdout:

  • \n (a new line so sed can process the header/body sizes separately)
  • %{size_header},%{size_download} (response header size and response body size in bytes, respectively, separated by a comma)
... -w "\n%{size_header},%{size_download}" ...

--silent (optional)

Depending on the type of request, curl might output progress updates to stderr in your terminal. It won't affect stdout, but you can suppress it using the --silent (-s) option.


Use command substitution ($(...)) to execute curl in a subshell and get its entire stdout (response headers and body) into a single $response variable for now; we will extract the headers and the body from it separately after the fact.

response=$(curl -si -w "\n%{size_header},%{size_download}" "${URL}")

$response should contain something like this:

HTTP/2 200 
age: 384681
cache-control: max-age=604800
content-type: text/html; charset=UTF-8
date: Tue, 03 Nov 2020 06:54:45 GMT
etag: "3147526947+ident"
expires: Tue, 10 Nov 2020 06:54:45 GMT
last-modified: Thu, 17 Oct 2019 07:18:26 GMT
server: ECS (ord/4CB8)
vary: Accept-Encoding
x-cache: HIT
content-length: 1256

<!doctype html>
<html>
...
</html>

331,1256

Notice the header,body sizes alone on the last line, thanks to the -w option.

sed – stream editor

... sed -n '$ s/^\([0-9]*\),.*$/\1/ p' ...
... sed -n '$ s/^.*,\([0-9]*\)$/\1/ p' ...

--silent

Use the --silent (--quiet/-n) option to prevent sed from outputting every line that passes through it.

$ address

Use the $ address to process only the last line (the header/body sizes).

s command

Use the s command to perform a substitution: a regex matches the whole line containing the header/body sizes, and the replacement keeps just the header size or just the body size, so that only that number is left to output:

  1. s: the substitution command.
  2. /: begin the regex search pattern.
  3. ^: match the beginning of the line.
  4. \(: begin a group to contain the header size.
  5. [0-9]*: match 0 or more digits (i.e. the header size).
  6. \): finish the group containing the header size.
  7. ,.*: after the group, match a comma, followed by 0 or more of any character (to match the rest of the line).
  8. $: match the end of the line.
  9. /: finish the regex search pattern; begin the replacement pattern.
  10. \1: replace the matched text (i.e. the entire line) with the group (i.e. just the header size).
  11. /: end the replacement pattern.
s/^\([0-9]*\),.*$/\1/

Above is for the header size, which is before the comma; below is similar for the body size, which is after the comma.

s/^.*,\([0-9]*\)$/\1/

p command

Use the p command to output the processed line despite the -n option.
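
To see the two expressions in isolation, here they are run against a short canned input whose last line mimics the -w output. The sample text is made up, and the expressions are written compactly, without spaces around the commands.

```shell
#!/usr/bin/env bash
# Made-up stand-in for the tail of $response; only the last line matters.
sample=$'</html>\n\n331,1256'

# Keep the digits before the comma on the last line.
headerSize=$(printf '%s\n' "$sample" | sed -n '$s/^\([0-9]*\),.*$/\1/p')

# Keep the digits after the comma on the last line.
bodySize=$(printf '%s\n' "$sample" | sed -n '$s/^.*,\([0-9]*\)$/\1/p')

printf 'header=%s body=%s\n' "$headerSize" "$bodySize"
```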


Use a "here-string" (<<<) to pass curl's $response to sed's stdin. Since we suppressed sed's output (-n), processed only the last line ($), substituted (s/...) the entire line with either the header size or the body size, then output it (p), sed should output just those, respectively. And again use command substitution to put them into $headerSize and $bodySize variables.

headerSize=$(sed -n '$ s/^\([0-9]*\),.*$/\1/ p' <<< "${response}")
bodySize=$(sed -n '$ s/^.*,\([0-9]*\)$/\1/ p' <<< "${response}")

Finally, now knowing the size of the header and body, use parameter substring expansion (${variable:offset:length}) to pull the response headers and response body into separate $headers and $body variables.

headers="${response:0:${headerSize}}"
body="${response:${headerSize}:${bodySize}}"
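
As an alternative to sed, the two sizes can also be pulled off the last line with parameter expansion alone. This is a sketch of a different technique, not the script above: the canned hdr and bdy strings stand in for a real response, and LC_ALL=C is set so that ${#var} and ${var:offset:length} count bytes, which is the unit curl uses for size_header and size_download.

```shell
#!/usr/bin/env bash
LC_ALL=C    # make string lengths and offsets count bytes, not characters

# Canned pieces standing in for a real response (made up for this example).
hdr=$'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n'
bdy=$'hello\nworld\n'

# Same shape as: curl -si -w "\n%{size_header},%{size_download}" "$URL"
response="${hdr}${bdy}"$'\n'"${#hdr},${#bdy}"

sizes=${response##*$'\n'}     # last line: "<header bytes>,<body bytes>"
headerSize=${sizes%%,*}       # text before the comma
bodySize=${sizes##*,}         # text after the comma

headers=${response:0:headerSize}
body=${response:headerSize:bodySize}

printf 'header bytes: %s, body bytes: %s\n' "$headerSize" "$bodySize"
```

This avoids starting two sed processes, at the cost of being bash-specific.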

Troubleshooting

Works for me in macOS High Sierra 10.13.6 using the following:

  • GNU bash version 3.2.57(1)-release (x86_64-apple-darwin17)
  • curl 7.54.0 (x86_64-apple-darwin17.0) libcurl/7.54.0 LibreSSL/2.0.20 zlib/1.2.11 nghttp2/1.24.0

If you're counting characters, the following things tripped me up:

  • HTTP headers have CR+LF for EOLs, so that's 2 bytes for each EOL.
  • Some code editors (such as Visual Studio Code) will summarily normalize line endings, so what ends up in your editor might not be exactly the same as what curl actually output.
  • Some character encodings (such as UTF-8, etc.) use multiple bytes to represent certain individual characters, so there might be fewer characters than the size of the header/body in bytes; similarly for control characters/other non-printing characters, etc. that might not appear in text/code editors.

So it might be better to use a hex editor rather than a text editor for troubleshooting.
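
The byte-versus-character difference can also be checked directly in bash. In this sketch, byte_len is a made-up helper that switches to the C locale inside the function so that ${#1} counts bytes; the sample string contains one two-byte UTF-8 character.

```shell
#!/usr/bin/env bash
# byte_len is a hypothetical helper: length of its argument in bytes.
byte_len() {
    local LC_ALL=C            # in the C locale every byte is one character
    printf '%s' "${#1}"
}

s=$'h\xc3\xa9llo'             # "héllo": 5 characters, 6 bytes in UTF-8

printf 'bytes via bash: %s\n' "$(byte_len "$s")"
printf 'bytes via wc:   %s\n' "$(printf '%s' "$s" | wc -c | tr -d ' ')"
```

If ${#s} in your normal locale gives a smaller number than byte_len, the shell is counting multibyte characters, and offsets taken from curl's byte sizes may drift; setting LC_ALL=C before slicing is one way around that.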
