Here document

From Wikipedia, the free encyclopedia

A here document (also called a here-document or a heredoc) is a way of specifying a string literal in shells such as Bash, Windows PowerShell and the Bourne shell, as well as in programming languages such as Perl, PHP, Python and Ruby. It preserves the line breaks and other whitespace (including indentation) in the text. Some languages allow variable interpolation, or even evaluation of code, inside the string.

The general syntax is << followed by a delimiting identifier, followed, starting on the next line, by the text to be quoted, and then closed by the same identifier on its own line. Many Unix shells, including the Bourne shell (sh) and zsh, have here documents as a way of providing input to commands.


Specific implementations

The following provides an overview of specific implementations in different programming languages and environments. Most of these are identical or substantially similar to the general syntax specified above, although some environments provide similar functionality but with different conventions and under different names.

Unix shells

In the following example, text is passed to the tr command using a here document.

$ tr a-z A-Z <<END_TEXT
> one two three
> uno dos tres
> END_TEXT
ONE TWO THREE
UNO DOS TRES

END_TEXT is used as the delimiting identifier; it marks the start and end of the here document. ONE TWO THREE and UNO DOS TRES are the output of tr.

By default, variables are expanded and commands in backticks are substituted inside the here document:

$ cat << EOF
> Working dir $PWD
> EOF
Working dir /home/user

This can be disabled by enclosing the delimiter on the command line in single or double quotes:

$ cat << "EOF"
> Working dir $PWD
> EOF
Working dir $PWD

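Appending a minus sign to the << (that is, writing <<-) causes leading tab characters, but not spaces, to be stripped from each line of the here document and from the line containing the closing delimiter. This allows here documents in shell scripts to be indented without changing their value.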
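
The tab-stripping rule can be modelled outside the shell. The following is a minimal sketch in Python, not shell code: the function name strip_leading_tabs is hypothetical, and it only imitates what the shell does to the body of a <<- here document, assuming that body is already available as a string.

```python
def strip_leading_tabs(body: str) -> str:
    """Imitate <<-: drop leading tab characters (not spaces) from every line."""
    # keepends=True preserves each line's newline so the text is otherwise unchanged.
    return "".join(line.lstrip("\t") for line in body.splitlines(keepends=True))


# Hypothetical indented body: one tab, two tabs, then four spaces.
script_body = "\tWorking dir\n\t\tnested line\n    spaces are kept\n"

stripped = strip_leading_tabs(script_body)
print(stripped, end="")
```

Only the tab-indented lines change; the line indented with spaces is passed through untouched, which is why shell scripts that rely on <<- have to be indented with tabs.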

Windows PowerShell

In Windows PowerShell, here documents are referred to as Here-Strings. A Here-String starts with an open delimiter (@" or @') at the end of a line and ends with a close delimiter ("@ or '@) at the very beginning of a later line. All characters between the open and close delimiters are considered part of the string literal. A Here-String with double quotes allows variables to be interpolated; one with single quotes does not. Interpolation applies only to simple variables (e.g. $x, but not $x.y or $x[0]). More complex expressions, or a set of statements, can be evaluated by enclosing them in $() (e.g. $($x.y) or $(Get-Process | Out-String)).

In the following PowerShell code, text is passed to a function using a Here-String. The function ConvertTo-UpperCase is defined as follows:

PS> function ConvertTo-UpperCase($string) { $string.ToUpper() }
PS> ConvertTo-UpperCase @'
>> one two three
>> eins zwei drei
>> '@
>>
ONE TWO THREE
EINS ZWEI DREI

Here is an example that demonstrates variable interpolation and statement execution using a Here-String with double quotes:

$doc, $marty = 'Dr. Emmett Brown', 'Marty McFly'
$time = [DateTime]'Friday, October 25, 1985 8:00:00 AM'
$diff = New-TimeSpan -Minutes 25
@"
$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!
"@

Output:

Dr. Emmett Brown : Are those my clocks I hear?
Marty McFly : Yeah! Uh, it's 8 o'clock!
Dr. Emmett Brown : Perfect! My experiment worked! They're all exactly 25 minutes slow.
Marty McFly : Wait a minute. Wait a minute. Doc... Are you telling me that it's 08:25?
Dr. Emmett Brown : Precisely.
Marty McFly : Damn! I'm late for school!

Using a Here-String with single quotes instead, the output would look like this:

$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!

Ruby

In the following Ruby code, a grocery list is printed out using a here document.

puts <<GROCERY_LIST
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
 
* Organic
GROCERY_LIST

The result:

$ ruby grocery-list.rb
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*

* Organic

Ruby also allows the closing identifier to be indented rather than starting in the first column of a line, provided the here document is opened with the slightly different starter "<<-". In addition, Ruby treats a here document as a double-quoted string by default, so the #{} construct can be used to interpolate code. The following example illustrates both features:

now = Time.now
puts <<-EOF
  It's #{now.hour} o'clock John, where are your kids?
  EOF

PHP

In PHP, here documents are referred to as heredocs.

<?php
 
$name       = "Joe Smith";
$occupation = "Programmer";
echo <<<EOF

This is a heredoc section.
For more information talk to $name, your local $occupation.
 
Thanks!

EOF;
 
?>

Output:

This is a heredoc section.
For more information talk to Joe Smith, your local Programmer.

Thanks!

Caution:

In PHP versions before 7.3, the line with the closing identifier may not contain any other characters, except possibly a semicolon (;). In particular, the identifier may not be indented, and there may not be any spaces or tabs before or after the semicolon. The first character before the closing identifier must be a newline as defined by the operating system, and the closing identifier (possibly followed by a semicolon) must itself be followed by a newline. If this rule is broken and the closing identifier is not "clean", it is not treated as a closing identifier and PHP continues looking for one. If no proper closing identifier is found, a parse error results, with the reported line number being at the end of the script. PHP 7.3 relaxed these rules: the closing identifier may be indented, and it no longer has to be followed by a newline.

For more information see heredoc in the PHP manual.


Perl

In Perl there are several ways to write heredocs. Enclosing the tag in double quotes allows variables to be interpolated, enclosing it in single quotes suppresses interpolation, and a bare tag behaves as if it were double-quoted. The end tag must appear at the beginning of a line; otherwise the interpreter will not recognize it.

Here is an example with double quotes:

my $sender = "Buffy the Vampire Slayer";
my $recipient = "Spike";
 
print <<"END";
 
Dear $recipient, 
 
I wish you to leave Sunnydale and never return.
 
Not Quite Love,
$sender
 
END

Output:

Dear Spike,

I wish you to leave Sunnydale and never return.

Not Quite Love,
Buffy the Vampire Slayer

Here is an example with single quotes:

print <<'END';
Dear $recipient,
 
I wish you to leave Sunnydale and never return.
 
Not Quite Love,
$sender
END

Output:

Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender

Python

Python has no dedicated here-document syntax, but string literals delimited by three single or double quotes (''' or """) may span multiple lines and serve the same purpose.

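A simple example with variable interpolation, yielding the same result as the first Perl example above, follows. The interpolation is performed by the % string-formatting operator rather than by the literal itself, and the backslash after the opening quotes suppresses the initial newline: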

sender = 'Buffy the Vampire Slayer'
recipient = 'Spike'
 
print("""\
Dear %(recipient)s,
 
I wish you to leave Sunnydale and never return.
 
Not Quite Love,
%(sender)s
""" % locals())

The Template class described in PEP 292 (Simpler String Substitutions) provides similar functionality for variable interpolation and may be used in combination with the Python triple-quotes syntax.
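
As a minimal sketch of that combination, the same letter can be written with string.Template and a triple-quoted literal. The variable names letter and message are chosen here for illustration only.

```python
from string import Template

sender = "Buffy the Vampire Slayer"
recipient = "Spike"

# A triple-quoted literal supplies the template text; the backslash after
# the opening quotes suppresses the initial newline.
letter = Template("""\
Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
""")

# substitute() raises KeyError if a placeholder has no value;
# safe_substitute() would leave unknown placeholders in place instead.
message = letter.substitute(sender=sender, recipient=recipient)
print(message, end="")
```

Because the placeholders use the $name form, the template text looks much like the Perl and shell here documents shown earlier, while the substitution itself remains an explicit method call.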

Tcl

Tcl has no special syntax for heredocs, because its ordinary string syntaxes already allow embedded newlines and preserve indentation. Brace-delimited strings undergo no substitution (interpolation):

puts {
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
 
* Organic
}

Quote-delimited strings are substituted at runtime:

set sender "Buffy the Vampire Slayer"
set recipient "Spike"
 
puts "
Dear $recipient, 
 
I wish you to leave Sunnydale and never return.
 
Not Quite Love,
$sender
"

Brace-delimited strings must be balanced with respect to unescaped braces. In quote-delimited strings, braces may be unbalanced, but backslashes, dollar signs, and left brackets all trigger substitution, and the first unescaped double quote terminates the string.

Note that both of the above strings have a newline as their first and last character, since that is what comes immediately after the opening delimiter and immediately before the closing one. The string trim command can be used to remove these if they are unwanted:

puts [string trim "
Dear $recipient, 
 
I wish you to leave Sunnydale and never return.
 
Not Quite Love,
$sender
" \n]

Similarly, string map can be used to effectively set up variant syntaxes, e.g. undoing a certain indentation or introducing nonstandard escape sequences to achieve unbalanced braces.
