C Basic Syntax - Statements, keywords, identifiers & comments Tutorial

In this C tutorial we learn what code statements are. We cover special reserved keywords in C and how to name our data containers, functions, macros and structures.

Finally, we cover how to define scope in C and how to document our code with comments.

Statements

A statement is an instruction that the compiler translates into code the computer can execute. In C, a statement is usually written on its own line and terminated by a semicolon ( ; ).

Example:
char message[] = "Hello World";

const float PI = 3.14;

#define MAX(a,b) ((a) > (b) ? (a) : (b))

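In the example above, each separate line of code is a single statement. Strictly speaking, the #define line is a preprocessor directive rather than a statement, but for now we can think of it in the same way.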

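We could place these statements one after the other on the same line. The compiler ignores extra whitespace, such as additional spaces and line breaks, between statements. The exception is a preprocessor directive like #define, which must stay on its own line.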

Example:
 char message[] = "Hello World"; const float PI = 3.14; // etc...

The compiler may not care that our statements are all on a single line, but it’s still bad practice. When statements are on their own separate lines, the code becomes much cleaner and easier to read.

We can place multiple statements on a single line because statements in C are usually terminated in some way.

For example, a preprocessor directive such as #define is terminated by a newline, a simple statement by a semicolon ( ; ), and a code block such as a function body by a closing curly brace ( } ).

Example:
// Terminated with newline
#define MAX(a,b) ((a) > (b) ? (a) : (b))

// Terminated with a ;
float variable = 1;

// Terminated with a }
int function()
{
	return 0;
}

This is similar to other languages in the C family like C# and C++, as well as Java and JavaScript.

If you’re coming from a language like Python that doesn’t use a semicolon to terminate statements, it’s important to remember to include them when needed.

We cover each statement terminator in the relevant lessons.

Keywords

A keyword is a reserved word in C that has special meaning to the compiler. We are not allowed to use these words as names for our variables, constants, arrays, functions etc.

The original C standard (C89/C90) defines the 32 keywords below, and later standards such as C99 and C11 add a few more. You don’t have to memorize them all right now; we will cover them in more depth in the rest of the course.

auto      break     case      char      const     continue  default   do
double    else      enum      extern    float     for       goto      if
int       long      register  return    short     signed    sizeof    static
struct    switch    typedef   union     unsigned  void      volatile  while

C also has a set of contextual special characters that have special meaning to the compiler in certain situations.

, < > . _ ( ) ; $ :
% [ ] # ? ' & { } "
^ ! * / | - \ ~ + =

As an example, let’s look at the array we initialized when we wrote our first application.

Example:
 char message[] = "Hello World";

In this example, char is our keyword. It indicates that each element of the message data container holds a single character.

As a reserved keyword, we cannot use the word char as a name for our data container.

Example:
#include <stdio.h>

// My first C program
int main()
{
    char char[] = "Hello World";

	printf(message);

	return 0;
}

If we select Build > Build and run for the example above, the compiler will raise an error in the Build Messages tab.

Output:
 C:\KHQC\LearningC\main.c|6|error: expected identifier or '(' before '[' token|

The compiler expected an identifier (a name) before the left square bracket ( [ ), but instead it got a keyword.

If we follow the rest of the statement, we see that we also have some special characters. These have special meaning to the compiler as well, but they may differ in certain contexts.

  • The square brackets indicate that our data container will be an array.
  • The = is the assignment operator and will assign whatever is on the right to whatever is on the left.
  • The double quotes that surround the words “Hello World” indicate that it’s a string of characters.
  • Finally, the semicolon ( ; ) terminates the statement.

We cover these special character operators in the tutorial on operators.

Identifiers

An identifier is the name that we give our data containers, functions, structures etc.

As an example, let’s consider the data container we worked with in our first app, called message.

Example:
 char message[] = "Hello World";

In this case the identifier would be the word message. It’s the custom word we use to refer to the data container.

We can call it something else, but we have to remember that there are certain rules that we must follow when naming in the C language.

Naming rules

1. A name may not start with a numerical value, but may contain a number inside of it.

Example:
// Not allowed
21_Jump_Street;

// Allowed
mambo_number_5;

2. A name may start with uppercase or lowercase alphabetical letters (A - Z or a - z), and underscores ( _ ).

Example:
// Allowed
Name;
name;
_name;

3. A name may not start with or contain special characters such as $, @, % etc.

Example:
// Not allowed
^_^;
n@me&surname;
#lostsock#thestruggleisreal;

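4. The C language does not set a maximum length for identifiers, but a compiler is only required to treat a limited number of leading characters as significant (in C99, 63 for internal names and 31 for external names). Names that differ only beyond that point may be treated as the same identifier, so it is safest to keep identifiers reasonably short.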

If we follow the rules above, we can choose any identifier we want. However, we should give meaningful names to identifiers so that our code makes sense and reads easier.

Example:
// Bad identifier
char a[] = "Good day";

// Better identifier
char welcomeMessage[] = "Good day";

Conventional identifier casing

All programming languages have conventions when it comes to identifiers and their casing.

For example, Python has a widely followed convention (outlined in the PEP 8 document) of naming variables and functions with snake_case, while classes use PascalCase.

Most languages, including the C family, are less strict with their conventions as long as the naming remains consistent. We’ll go through some of the conventions below.

Snake case

Snake case is where all the words of an identifier are lowercase and each word is separated by an underscore.

If the identifier is only a single word, it’s all in lowercase.

Example:
char welcome_message[] = "Hello World";

char message[] = "Hello World";

Some studies suggest that snake case is the easiest to read. Even so, many developers prefer camel case.

Camel case

Camel case is where the first word of an identifier is lowercase and all subsequent words are capitalized. The words are not separated like with snake casing.

If the identifier is only a single word, it’s all in lowercase.

Example:
char welcomeMessage[] = "Hello World";

char message[] = "Hello World";

Pascal case

Pascal case is where every word is capitalized, even if there is only one word.

Example:
char WelcomeMessage[] = "Hello World";

char Message[] = "Hello World";

Uppercase

Uppercase is where each word is in full uppercase, and words are separated by an underscore.

Example:
char WELCOME_MESSAGE[] = "Hello World";

char MESSAGE[] = "Hello World";

C programmers are usually divided into two groups on their casing preference.

Group 1:

  • Variables and functions are written in camel case.
  • Structs are written in pascal case.
  • Constants and macros are written in uppercase.

Group 2:

  • Variables, functions and structs are written in snake case.
  • Constants and macros are written in uppercase.

We belong to the first group, and will use that as our convention throughout this course.

You may be required to follow certain conventions at your workplace or when collaborating on a project. But, for the most part it will be up to your personal preference which casing and bracing conventions you use.

The important part is that you pick one convention and stay consistent with it.

Scope and indentation

We will often need to group several related statements into a code block.

In C, we define a code block with an opening curly brace ( { ) and a closing curly brace ( } ).

Example:
int printMessage()
{	// this is
	// a code
}   // block

Everything inside a code block is considered to be of that scope. When code is of a scope, it cannot be used outside of that scope (in most cases).

Example:
#include <stdio.h>

int main()
{
    printf(message);

	return 0;
}

int printMessage()
{
    char message[] = "Hello World";

    return 0;
}

In the example above, we declared message inside the scope of the printMessage() function. When we try to access it outside of the function, the compiler will raise an error and the code will not be executed.

We will revisit the concept of scope in the tutorials on functions, conditional statements and loops.

To keep our code clean, we use indentation when we define scope. For example, all the code inside a function is indented one level to indicate that it’s in the function’s scope.

It’s not required to indent our code when defining scope, but it is considered good practice and makes the code easier to read.

Example:
int printMessage()
{
	// this code is indented
	// one level and makes
	// it easier to read
}

In general, a tab equal to four spaces is considered one indentation level. Most IDEs use a four-space tab by default.

Developers in C will define scope in different ways. Some developers like to have the opening brace on a separate line, others like the opening brace on the same line as the statement.

Example:
int printMessage() {
	// open brace on
	// the same line
	// as the statement
}

int printMessage()
{
	// open brace on
	// its own separate
	// line
}

Both conventions are valid. In some situations you may be forced to follow one convention over another, but for the most part it will be up to your personal preference. Again, pick whichever one you like best, but stay consistent.

In this course we’ll be writing the opening brace on its own line.

Comments

Commenting is a way for a developer to document their code to enhance its readability. Comments in C are ignored by the compiler, which means the compiler won’t try to execute any text inside a comment.

Our code should always be written cleanly and clearly enough that it’s obvious as to what the code does. However, there are many situations where we should include comments in our code.

  • When we’re learning a new language, we’re not yet used to the syntax or how the language operates. In this case it’s recommended to comment code heavily in your own words.
  • When we’re working on projects as part of a group it’s helpful to comment sections of code so that other programmers immediately understand what’s going on. They won’t need to read through the code to figure out what its purpose is.
  • When we use external libraries we should comment on anything that’s not immediately obvious so that it won’t be necessary to go back and forth to the documentation for lookup.
  • When we have a section of complex code, we should comment what it’s doing or what it’s for.
  • When we want to temporarily disable a piece of code for debugging etc. we can simply comment it out.

There are two types of comments supported in C.

  • Single line comments.
  • Multi-line comments (comment blocks).

Single line comments

A single line comment will work only up to the end of that line. Once a new line is encountered, the comment will end.

To write a single line comment, we prefix any text on a single line with two forward slashes ( // ).

Example:
#include <stdio.h>

int main()
{
    // This is a comment
    printf("Hello World");
	return 0;
}

When the compiler sees the two forward slashes, it ignores the rest of that line, because it knows that everything after the slashes is a comment.

Multi-line comments

A multi-line comment can span a single line or multiple lines. A multi-line comment has an open and close tag that indicates to the compiler where the comment starts and ends.

To write the opening tag of a multi-line comment, we use a single forward slash followed by an asterisk ( /* ). To write the closing tag, we reverse the opening one and write an asterisk followed by a forward slash ( */ ).

Example:
#include <stdio.h>

int main()
{
    /* This comment can
     * span multiple lines
     * because it has open
     * and close tags
     */
    printf("Hello World");
	return 0;
}

The compiler will skip anything written between the open and close tags and move on to the code below it.

As mentioned earlier, a multi-line comment can also be used on a single line. This is mostly used when we need to comment in between pieces of code.

Example:
#include <stdio.h>

int main()
{
    for (int i = 0; /* i <= 10 */ i < 10; i++)
    {
        printf("%d\n", i);
    }

	return 0;
}

In the example above, we comment out a piece of code between other code with the multi-line comment.

Typically, developers use single line comments instead of multi-line comments even if their comments will span multiple lines.

Example:
#include <stdio.h>

int main()
{
    // This comment can
    // span multiple lines
    // because every line
    // starts with two slashes
    printf("Hello World");
	return 0;
}

Comment shortcuts

Most IDEs allow us to use a shortcut to comment out a piece of code. Codeblocks has two different shortcuts, one to comment and another to uncomment.

To use this feature, we have to highlight a section of code before pressing the shortcut.

  • Comment: Ctrl + Shift + C on highlighted code.
  • Uncomment: Ctrl + Shift + X on highlighted code.

The Codeblocks Wiki has a full list of shortcuts if you want to speed up your development.

If you’re not using Codeblocks as your IDE, you may have to check your IDE’s documentation to find the shortcut.

Summary: Points to remember

  • Keywords are reserved words in C that have a special meaning to the compiler.
  • Special characters also have special meaning to the compiler but only in certain circumstances.
  • Identifiers are the names we give our data containers, functions, macros and structures.
    • We can choose snake, camel or pascal casing for our identifiers.
  • Statements are instructions that the compiler translates into executable code.
    • Statements are usually written only one per line.
    • Statements are typically terminated by a semicolon.
  • Scope is defined by a code block with open and close curly braces.
    • We visually indicate a code block by indenting once to keep the code clean and easily readable.
  • Comments are used to document our code.
    • Single line comments are prefixed with two forward slashes and are only valid on a single line.
    • Multi-line comments are wrapped in /* open and */ close tags and can span multiple lines.
  • IDEs have shortcuts to easily comment out sections of code.
    • Codeblocks uses Ctrl + Shift + C to comment and Ctrl + Shift + X to uncomment.
    • IDEs like Atom use a single shortcut to comment and uncomment, like Ctrl + /.