Comments in programming languages (2024)

This paper covers the history and use of comments in programming languages, from the beginning of programming to the present day. Comments in many programming languages are discussed, including modern languages such as C, Java, and scripting languages, and older languages such as Ada, COBOL, and FORTRAN. Design issues, types of comments, and problems with comments are illustrated.

Comments

Comment Design Issues

Full-Line Comments

COBOL Comments

Position of Comment Indicator

End-of-Line Comments

Block Comments

Syntax of Comments

Placement of Comments

XHTML Comments

Nested Comments

Comments for Backward Compatibility

Comments for Hiding Code

Mega-Comments

Questions

Answers:

Please send suggestions and comments to dvantassel@gavilan.edu

Comments are used in a programming language to document the program and remind programmers of what tricky things they just did with the code, or to warn later generations of programmers stuck with maintaining some spaghetti code. While comments may seem to be a minor issue in a language, an awkward comment format is a nuisance and can be a source of nasty errors. The compiler handles the content of a comment as if it were not there. Examples of modern-day comments are:

max = 100; // using default size.

/* check input for valid values and

print error message for accounting if problems. */

We have two types of comments here, the end-of-line comment and the block comment. An end-of-line comment terminates at the end of the line. A block comment has a terminator and can continue for several lines, or be less than one line.

Comments were called REMarks in BASIC. COBOL used a NOTE among other types of comments. ALGOL 60 used the reserved word comment to start a comment and the semicolon to terminate the comment.

Comment Design Issues

There are a few comment design issues for us to consider. Some are:

Where do comments start? Do they start any place or at a particular column? Early COBOL, BASIC, and FORTRAN started comments at a particular position.
How are comments ended? Obvious choices are at the end of the line, or with a comment terminator like the */ of Java.
Can comments nest? If so, exactly how does the syntax work?
How can we comment out a hundred lines of code that has comments when we want to do testing or debugging?

Some of these issues have answers in modern languages, but others are still unresolved.

Full-Line Comments

In the FORTRAN, BASIC, and COBOL languages, comments are full lines, and each comment is begun by a specific comment mark in a fixed position on the line. In BASIC, REMark lines start with REM.

010 REM FIND PRIME NUMBERS LESS THAN 100

020 REM BY DENNIE VAN TASSEL

030 REM JULY 4, 1965

040 LET A = 1

The same thing would be done in FORTRAN as follows:

C FIND PRIME NUMBERS LESS THAN 100

C BY DENNIE VAN TASSEL

C JULY 4, 1957

A = 1

A FORTRAN comment is indicated by a C in position 1, and only works if the C is in position 1. The comment takes the entire line. In these early languages, programming was done with cards, so there was an obsession with lines (cards) and the beginning and end of cards (lines) that present-generation programmers cannot understand. Multiple-line statements or two statements on the same line were not imagined. Since both BASIC and FORTRAN used single lines for their statements, it is not surprising they used the same convention for comments. These full-line comments are placed on separate lines before or after the code that needs to be commented.

COBOL Comments

COBOL has a similar style of comments. An asterisk has to be put in position 7, and then the rest of the line is a comment. COBOL labels have to start in position 8 or later, and COBOL statements have to start in position 12 or later. Here is how comments would look in COBOL:

010010* FIND PRIME NUMBERS LESS THAN 100

010020* BY DENNIE VAN TASSEL

010030* JULY 4, 1959

010035 START-LOOP.

010040     MOVE 1 TO A.

In the above code, positions 1-6 hold a sequence number: the page number (010) in positions 1-3 and the card number (040 in the last line) in positions 4-6. Positions 73-80 were often used to indicate the name of the program, so most comments would end by position 72. A good 1960 COBOL (or FORTRAN) compiler could indicate if cards were out of sequence. In FORTRAN this same numbering scheme was used, but the numbers were in positions 73-80.

Position of Comment Indicator

BASIC, FORTRAN, and COBOL have two common characteristics for their comments. First, comments terminate at the end of the line. Second, the comment indicator was in a particular position. The 80-column cards made a particular column meaningful. All of these languages were very column oriented. We can call this type of comment a positional comment, since it must start in a particular position. Table x.1 describes this type of comment.

Language	Comment Syntax
FORTRAN	C in position 1
BASIC	REM at beginning of the line
COBOL	* in position 7

Full-Line Comments

Table x.1

Notice that all three of these languages are very old. When these languages started, computers had memory of 4K or 8K, which is probably less than your toaster. Knowing where comments had to start made it easy for early compilers to find the comment and dispose of it. The compilers needed all the help they could get. So if a compiler knew that all FORTRAN comments had to have a C in position 1, then it was easy to find the comments. Then the compiler could ignore that line. In modern languages the comment can start at any place on the line and end at any place on the line; a good portion of that available 4K would have been necessary just for processing such comments!
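To make the point about cheap detection concrete, here is a minimal sketch in Python of the column checks from Table x.1. The function name is_full_line_comment and the language labels are hypothetical, chosen only for this illustration; a real compiler of the period did far more than this.

```python
def is_full_line_comment(line, language):
    """Return True if `line` is a positional full-line comment.

    Hypothetical helper: each test looks at one fixed place on the line,
    which is why early compilers could afford it.
    """
    if language == "FORTRAN":
        # A C in position 1 marks the whole card as a comment.
        return line[:1] == "C"
    if language == "COBOL":
        # An asterisk in position 7 (index 6 when counting from zero).
        return line[6:7] == "*"
    if language == "BASIC":
        # REM is the first word after the line number.
        words = line.split(maxsplit=2)
        return len(words) >= 2 and words[0].isdigit() and words[1] == "REM"
    raise ValueError("unknown language: " + language)


print(is_full_line_comment("C FIND PRIME NUMBERS LESS THAN 100", "FORTRAN"))
print(is_full_line_comment("010010* FIND PRIME NUMBERS LESS THAN 100", "COBOL"))
print(is_full_line_comment("040 LET A = 1", "BASIC"))
```

Each check inspects a fixed column or the first word only, so no scanning of the rest of the line is needed.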

End-of-Line Comments

With assembly language we have two improvements in comments. First, the comment does not have to be indicated in position 1; the comment could start in a later position. Second, the line could have useful commands or instructions to the left of the comment. Assembly language starts a comment with a semicolon any place on the line. Here is how comments can look in assembly language:

; FIND PRIME NUMBERS LESS THAN 100

; BY DENNIE VAN TASSEL

; JULY 4, 1954

MOV C, 1 ; SET COUNT TO 1 FOR THE STARTING VALUE.

Now we do not have to start the comment in a particular position. The MOV command has a comment on the same line as the command itself. These comments still terminate at the end of the line and are called end-of-line comments. We have expanded our comment capability quite a bit, especially since we can have useful commands on the same line to the left of the comments. Table x.2 illustrates end-of-line comments in several languages.

Language	Comment Syntax
ALGOL 60	; (semicolon)
Assembly Languages	; (semicolon)
Ada, MySQL	-- (two dashes)
C++/Java	// (two slashes)
FORTRAN 90	! (exclamation mark)
Perl, TCL, UNIX Shell, MySQL	# (hash sign)
Visual Basic .NET	' (apostrophe)

End-of-Line Comment

Table x.2
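As a rough illustration of how little an end-of-line comment demands of a translator, the following Python sketch drops everything from the comment marker to the end of the line. The name strip_eol_comment is hypothetical, and the sketch is deliberately naive: it assumes the marker never appears inside a string literal.

```python
def strip_eol_comment(line, marker):
    """Remove an end-of-line comment that starts with `marker`.

    Naive sketch: a marker inside a string literal would be
    mistaken for the start of a comment.
    """
    code, _, _ = line.partition(marker)
    return code.rstrip()


print(strip_eol_comment("MOV C, 1 ; SET COUNT TO 1 FOR THE STARTING VALUE.", ";"))
print(strip_eol_comment("max = 100; // using default size.", "//"))
```

The same function serves every row of Table x.2, since only the marker changes from language to language.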

Block Comments

When we get into languages with multiple-line programming statements we find comments that can be multi-line or in-line comments. These comments are not concerned with line boundaries. There are two needs not addressed by the previous two types of comments. We may want a short comment in the middle of some code (an in-line comment) or we may want comments that are several lines long. Multiple-line comments can be approximated with several full-line comments, but a short in-line comment in the middle of a line requires a comment with delimiters.

Here is what some of the languages use for block comments:

Language	Comment Syntax
ALGOL	comment . . . ;
Pascal	(* . . . *) or { . . . }
Many languages	/* . . . */
Forth	( . . . )
HTML	<!-- . . . -->
Haskell	{- . . . -}

Block Comments

Table x.3

ALGOL starts a comment with the word comment and ends the comment with the first semicolon it finds. Early Pascal used (* and *) for comments since they only had round parentheses on keyboards back then. After braces were added to keyboards, { and } were also allowed for comments.

It is obvious that the C style comments have won; they came to C from its predecessor B. "Multiple-line comments" is not quite correct terminology since these comments can be on only one line with commands on either side as follows:

sum = 0; /* initialize variables */ max = 100;

But if you do something like this, you need to be punished in some way. There does not seem to be a good name for this type of comment. The comment can be before a command, in the middle of a command, after a command, or be several lines long. The best terminology seems to be to call it a block comment, and that is what it is called in some textbooks.

When Ada was designed, both block and end-of-line comments were in common usage. But Ada has only one type of comment. Ada uses two dashes (--) to start a comment that ends at the end of the line. My guess is the Ada designers did not feel the benefit of block comments was greater than the problem of run-away comments (not closing a block comment).

Syntax of Comments

Notice that some languages (assembly, FORTRAN 90, and Perl) start comments with only one character, but other languages (Ada, C++) use two characters to start a comment. Using two characters such as // or /* helps prevent the accidental starting of a comment, which is a risk with a single character such as the semicolon (;) in assembly language or the exclamation mark (!) in FORTRAN 90. One other observation is we need to use two characters that will not otherwise have a meaning in the language. The double slash // does pretty well in this context, but the /* does not do quite as well. For example in C we use a lot of pointers. Suppose we have a pointer ptr, and want to use *ptr to get the contents of the address being pointed at. Then you (not me!) might type the following line:

a = 1/*ptr + 4.3;

Do you see any problem here? The "/*ptr + 4.3" is the start of a comment, not a division by *ptr. This is one of the few places in C where a space is significant. So, we need to change the line to:

a = 1/ *ptr + 4.3; or

a = 1/(*ptr) + 4.3;

The last version using parentheses is probably better, but I needed a place to show that wonderful example of where a space is important in that first line.
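The effect of the missing space can be seen with a small scanner. The Python sketch below treats every /* as the start of a block comment, as a C compiler would, and replaces each closed comment with a single space. The name strip_block_comments is hypothetical, and the sketch ignores string literals and // comments, so it is an illustration rather than a real C preprocessor.

```python
def strip_block_comments(src):
    """Remove /* ... */ comments from C-like source text.

    Sketch only: string literals and // comments are not handled.
    An unterminated comment swallows the rest of the source.
    """
    out = []
    i = 0
    while i < len(src):
        if src.startswith("/*", i):
            end = src.find("*/", i + 2)
            if end == -1:
                break              # run-away comment: nothing more survives
            out.append(" ")        # a closed comment behaves like one space
            i = end + 2
        else:
            out.append(src[i])
            i += 1
    return "".join(out)


print(repr(strip_block_comments("a = 1/*ptr + 4.3;")))
print(repr(strip_block_comments("a = 1/ *ptr + 4.3;")))
```

In the first call the division and everything after it disappear into a comment that is never closed; in the second call the space keeps the slash and the asterisk apart and the whole statement survives.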

A second, lesser problem with a two-character comment delimiter is that an extra space between the two characters will cause the comment to be missed. For example:

x_ptr = x / *ptr

So is the above trying to do division, or was an accidental space put after the slash and before the asterisk of a comment? This problem will probably be caught by the compiler, or at least I have not been able to come up with an example where the compiler would not find it.

Placement of Comments

Where can comments be placed? Can comments go before or after the program? In most modern languages comments can go before or after the program. But in XML comments are not allowed before the XML declaration, which must come first. In most languages a comment can go any place a space would occur except within a character string or within another comment.

While the previous rule is a common description of where comments can go, it is not quite correct. A comment cannot be placed where it would hide the start or end of a block comment. For example:

/* Dennie Van Tassel

wrote this nice program

// with great skill and few smarts. */

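So were you smart enough to see what looks wrong with the above comment? The last line has an end-of-line comment marker in front of the ending of the block comment, and it is tempting to read the */ as hidden. C, C++, and Java do not recognize // inside a block comment, so there the comment still closes. The reverse case really does hide a delimiter: a /* that follows // on the same line never opens a block comment. I am sure you saw it.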

XHTML Comments

XHTML has similar potential problems. For example, XML comments cannot go within declarations, tags, or other comments. Also, since XML and HTML use a paired command structure, we must be careful not to mess up the pairing. All of the following are illegal in XML:

<!--

<x12>

Illegal since it messes up pairing of x12 tag -->

</x12>

<!--

<B12>

</B12>

-->

There are several commenting errors in the above XML code. On the first line we have a comment in the tag <A00>, which is not allowed. Next, we start a comment on the line before the tag <x12>, which hides that tag. In the last four lines we have comments inside comments (nested comments), which is also not allowed. Comments inside comments (nested) would often be useful, and this topic is discussed next.

Nested Comments

One serious problem with multiple-line comments is forgetting to terminate a comment. In C++ we could often have something like this:

/* set variables

a = 0;

/* set maximum size */

maxs = 100;

What is incorrect with the above code? Go back and look at it again. If you missed the error this shows how easy that error is to miss. The first comment was erroneously not closed, so the statement "a = 0;" gets eaten up (or swallowed up) in a comment that runs on until the */ two lines later.

This type of error, called a run-away comment, is a very difficult bug to locate! Comments that stop at the end of the line avoid this problem. This problem is the primary reason people argue that multiple-line comments are a bad option. Thus there is some debate whether multiple-line comments are a good or bad idea.

Ada does not have multiple-line comments. Instead its comments start with two dashes and terminate at the end of the line. I imagine the designers decided against having multiple-line comments to avoid the problem of run-away comments.

There are several solutions to this problem. One is that the compiler can warn about all nested comments, that is, any comment that has a "/*" in it. The second solution is to forbid nested comments, which is done in some languages, including C++, where block comments do not nest: a /* inside a comment does not open a second comment, and the first */ closes the whole comment. Another method for avoiding the error of not terminating a comment is for the compiler to check for statement terminators (the semicolon) in comments and provide a warning.

In the above incorrect code, the semicolon at the end of the line "a = 0;" would generate a warning message by the compiler. Otherwise, we can allow nested comments, but the compiler can indicate any comments that do not nest properly, and warn about nested comments. Different languages use different approaches and each approach seems to have its own benefits and drawbacks.
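The two warning strategies described here (a comment opener inside a comment, and a statement terminator inside a comment) can be sketched in a few lines of Python. The function name runaway_warnings and the wording of the messages are hypothetical; the sketch assumes C-style /* ... */ comments that do not nest and ignores string literals.

```python
def runaway_warnings(src):
    """List suspicious block comments in C-like source text.

    Sketch only: flags a comment whose body contains '/*' or ';',
    and a comment that is never closed.
    """
    warnings = []
    i = 0
    while True:
        start = src.find("/*", i)
        if start == -1:
            break
        end = src.find("*/", start + 2)
        body = src[start + 2:] if end == -1 else src[start + 2:end]
        line = src.count("\n", 0, start) + 1   # line where the comment opens
        if end == -1:
            warnings.append("line %d: comment is never closed" % line)
        if "/*" in body:
            warnings.append("line %d: '/*' inside a comment" % line)
        if ";" in body:
            warnings.append("line %d: ';' inside a comment" % line)
        if end == -1:
            break
        i = end + 2
    return warnings


code = "/* set variables\na = 0;\n/* set maximum size */\nmaxs = 100;\n"
for message in runaway_warnings(code):
    print(message)
```

For the incorrect code above, both checks fire on the comment opened on line 1. The semicolon check also fires on harmless comments that happen to contain a semicolon, which is one of the drawbacks of that approach.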

For example, in XHTML, comments are opened with <!-- and closed with -->, but otherwise we cannot insert two consecutive dashes in the comment. Thus

<!-- set variables

<b>careful</b>

<hr>

is a syntax error in this language. So one may jump to the conclusion that nested comments should be outlawed.

But there is another opinion. Besides the fact that nested comments are useful, neat, and elegant, there is another good reason for wanting them: we often need to comment out statements that already have comments:

/* comment out for testing

a = 0;

/* set parameters for end of year */

months = 12;
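If a language did allow nesting, the scanner would need to count how many comments are open instead of stopping at the first closer. The following Python sketch shows one way that could work for C-style delimiters; the name strip_nested_comments and the choice to raise an error for an unclosed comment are assumptions made for this illustration, not the behavior of any particular compiler.

```python
def strip_nested_comments(src):
    """Remove /* ... */ comments, allowing them to nest.

    Sketch only: a depth counter tracks how many comments are open.
    String literals are not handled, and a stray closer outside any
    comment is left in the output as ordinary text.
    """
    out = []
    depth = 0
    i = 0
    while i < len(src):
        if src.startswith("/*", i):
            depth += 1                 # one more comment is now open
            i += 2
        elif depth > 0 and src.startswith("*/", i):
            depth -= 1                 # closes the innermost open comment
            i += 2
        elif depth > 0:
            i += 1                     # inside a comment: skip the character
        else:
            out.append(src[i])
            i += 1
    if depth > 0:
        raise SyntaxError("%d comment(s) left open" % depth)
    return "".join(out)


code = (
    "/* comment out for testing\n"
    "a = 0;\n"
    "/* set parameters for end of year */\n"
    "months = 12;\n"
    "*/\n"
    "years = 1;\n"
)
print(strip_nested_comments(code))
```

With nesting, the inner comment no longer ends the outer one, so both statements stay commented out until the final closer. The price is the rule for unmatched delimiters: here an unclosed comment is reported as an error at the end of the source.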

Comments for Backward Compatibility

With the web, comments are used to make code backward compatible, which is a difficult task since we cannot change history except in science fiction and politics. We use comments to hide JavaScript code from old web browsers as follows:

<script>

<!--

JavaScript code here

//-->

</script>

HTML comments start with <!-- and end with -->. In the above code the second line <!-- starts an HTML comment, which is closed by the --> immediately before the closing </script> tag. The new browsers are instructed to ignore these comment markers inside script blocks. Thus the JavaScript code gets used. The old browsers see a comment and do not process any of the JavaScript commands because they think all that is just a comment. Otherwise, these JavaScript commands might cause errors for the browser.

When we add JavaScript or Cascading Style Sheets to a web page, we also require the closing comment to start on a new line with // and then -->. The // is a regular single-line comment in JavaScript. So the // is used to comment out the closing -->; otherwise we would have a syntax error in our JavaScript program. This use of comments is quite new and quite complicated. Some very clever people figured out all this!

Comments for Hiding Code

While comments are needed for documenting a program, comments are also used to hide code that is needed for debugging or testing but not for production. Here is a commented-out statement:

// cout << "count= " << count << endl;

The above line is useful for debugging, but not needed for production. Often the best thing to do is leave the debugging or testing statements in the program but comment them out. Commenting out code is not just for debugging or testing. A half-implemented procedure in a production version can be left in the program by commenting it out, without having to remove and then re-add the code later.

Mega-Comments

We need a fourth type of comment, a mega-comment, that can be used to comment out code that contains regular comments. Few languages have this category. XML has come up with a mega-comment, the conditional section of a DTD, that comments out code while avoiding the problem of nested comments:

<![IGNORE[

DTD. . .

]]>

which will ignore the DTD line. Any other type of comment can be enclosed within the IGNORE block. When we want it included, we change the first line to

<![INCLUDE[

and the code is included. This allows us to document what code is needed for debugging/testing and to switch back and forth easily. This type of XML comment can also be nested.

Thus we have four types of comments. They are

Full-line comments
End-of-line comments
Block or multiple-line comments
Mega-comments

Few languages have all four categories. Both full-line and end-of-line comments can be done the same way, since all they need is a starting indicator and they both terminate at the end of the line. Block comments have a way to indicate the beginning and ending (the delimiters) of the comments. Thus these comments can be used for short comments in the middle of a line of code or for multiple-line comments. Mega-comments are presently rare in languages, but very useful for commenting out code with comments in the code.

In languages that have single-line comments and multiple-line comments, a careful programmer can create her own mega-comment. For example, in C++ we could use only the single-line comments with // for documentation, and reserve /* . . . */ for commenting out lines of code. This avoids the problem of nested comments.
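The convention only works if the code being commented out really contains no block comments. A small helper could check that before wrapping the lines, as in this Python sketch. The name comment_out is hypothetical, and the sketch assumes C++-style delimiters and a list of source lines as input.

```python
def comment_out(lines):
    """Wrap source lines in one /* ... */ block, acting as a home-made mega-comment.

    Sketch only: refuses lines that already contain a block comment
    delimiter, because those would end the wrapper early.
    """
    for number, line in enumerate(lines, start=1):
        if "/*" in line or "*/" in line:
            raise ValueError("line %d contains a block comment delimiter" % number)
    return ["/*"] + list(lines) + ["*/"]


block = comment_out([
    "a = 0;          // set parameters for end of year",
    "months = 12;",
])
print("\n".join(block))
```

Lines documented with // pass through unchanged inside the wrapper, which is the whole point of keeping the two comment styles for two different jobs.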

Questions

1. When the Pascal language was first used the keyboards had only a limited character set, so (* . . . *) was used to indicate comments. Soon more characters were added to the keyboards, including { and }. One observant computer scientist (CS) noticed these characters, and Pascal comments were then allowed to use { comments here }. Now we only need to type one character to start a comment instead of two. Shall we give the person who suggested this the CS award of the year? Or shall we tell her or him it was a bad change? Hint: what happens if a programmer types this single character by mistake?
2. Go back and read the earlier section on "Nested Comments." In a couple of programming languages that you know, see if the compiler catches nested comments. There are two ways the compiler can find them: the start of another comment, or a statement terminator in the comment. What happens on your compiler? Do you get warnings or errors?
3. So now design the comment system for OPL. If you allow multiple-line comments, how will you prevent the problem of forgetting to close a comment? How will you start and terminate comments?
4. Bjarne Stroustrup has this interesting example of rare code where C and C++ interpret the code differently due to the comments:

int b = a//* divide by 4 */4;

-a;

When comments are deleted, what is the result with C and C++? Reminder that C++ has // comments and C (before C99) does not.

5. Should we forbid nested comments in OPL? Give some arguments for and against allowing nested comments.
6. Suppose we decide to allow nested comments. What problems do we need to solve? Set up a couple of examples.
7. Should we have some mega-comments in OPL that can be used to comment out lines of code, which may have other types of comments? Design a mega-comment for OPL.
8. The designer of C++, Bjarne Stroustrup, states that block comments do not nest. So what does this mean for compiler writers? Should the compiler issue a warning message and keep going, or issue a severe error and stop compiling? Try some nested comments in a couple of different languages (C++ and Java would be ok), and see what happens.
9. Ada only has end-of-line comments, probably to avoid run-away block comments. How do you feel about that design decision? Give some reasons for and against their decision.
10. JavaScript on the web uses XHTML comments to hide code from old browsers. Two dashes are used to start and end the comments. Are we allowed to use two dashes inside the JavaScript code? What happens if we have a complete XHTML comment inside the JavaScript code?

Answers:

1. My general impression is we want to use two characters to start and stop a comment to avoid the problem of starting a comment by accident. Thus /* is a nice way to start a comment but the single character ! of FORTRAN 90 is questionable. Likewise, we want to use two characters to end a comment if the end of the line does not end the comment. And we want the ending and starting symbols for comments to be different. But all my opinions here may be wrong.

4. In C++, when we remove comments we get:

int b = a -a;

But in C when we remove comments we get:

int b = a/4; -a;
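The two readings can be checked mechanically. The Python sketch below removes comments under either rule set; the function name strip_comments and its line_comments flag are hypothetical, with the flag switched off standing in for C without // comments and switched on standing in for C++. String literals are not handled.

```python
def strip_comments(src, line_comments):
    """Remove comments from C-like source text.

    Sketch only: block comments are always removed; // comments are
    removed only when `line_comments` is True. Comments are deleted
    outright rather than replaced by a space.
    """
    out = []
    i = 0
    while i < len(src):
        if src.startswith("/*", i):
            end = src.find("*/", i + 2)
            i = len(src) if end == -1 else end + 2
        elif line_comments and src.startswith("//", i):
            end = src.find("\n", i)
            i = len(src) if end == -1 else end   # keep the newline itself
        else:
            out.append(src[i])
            i += 1
    return "".join(out)


source = "int b = a//* divide by 4 */4;\n-a;"
print(strip_comments(source, line_comments=False))   # C without // comments
print(strip_comments(source, line_comments=True))    # C++
```

Without // comments the first slash is a division and the second slash opens a block comment, so the division by 4 survives. With // comments the rest of the first line is discarded and the statement continues on the next line.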

5. If we have many lines of code we want to comment out for testing or debugging, then those lines may include comments. So one approach seems to be to allow nested comments, but have the compiler warn about the occurrences. For: 1. Can comment out code that contains comments. 2. Seems neat or elegant.

Against: 1. Source of nasty errors. 2. Not a statement (like if-then-else) that needs to nest, and many other items do not nest, such as strings.

6. We need to set up rules for ending nested comments, or we end up with dangling comment closers. The problem seems similar to solving the nesting of if-then-else statements and matching the else with the closest previous then.

7. But then XML has invented a special mega-comment used just for commenting out statements with comments. This is an interesting approach.