Chapter 1: Introduction
"A journey of a thousand miles begins with a single step."
-Chinese Proverb, Tao Te Ching
In This Chapter:
The purpose of this introductory chapter is to expose you to some of the vocabulary and terms used in the textbook and to give a brief overview of programming languages. Keywords appear in bold font. If these keywords are unfamiliar to you at first, that is okay: it is not important that you memorize their definitions when you first see them, as they will be defined and used throughout the textbook. In addition, this chapter covers the history and evolution of programming languages, from early programming in the 1940s to today.
You will learn:
- What is a computer programming language?
- Syntax vs. Semantics
- History of languages
- Evolution of languages
- Modern refinement of languages
- What makes a programming language good or bad?
- Creating languages
- Why study programming languages
- Who uses programming languages
What Is A Programming Language?
A computer programming language is a formal language, similar in some ways to natural languages such as English and French. Programming languages are used to create and implement specific algorithms that produce various kinds of output. A programming language expresses a set of instructions to a digital computer, and those instructions are used to create a program that carries out specific sets of tasks.
Formal Language Definition:
All programming languages are formal languages. A formal language has two defining characteristics:
- Syntax: a precise set of rules that determine the structure of statements, the allowed symbols, and the legal combinations of expressions. For example, young children learn to write 1+2=3. This expression is easy for humans to understand, but a computer may require it to be written differently, for example as 3=2+1. Syntax defines the rules by which the operators are used in this expression.
- Semantics: a precise set of rules that define the meanings of the symbols and legal expressions. In the example above, semantics define what the operators and digits mean in terms of value. You could, for instance, create a language in which the addition symbol + actually tells the machine to multiply.
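The distinction can be made concrete with a minimal Python sketch. It is only an illustration: the function names `is_valid_syntax` and `evaluate` are hypothetical, and Python's own `ast` module stands in for the syntax rules of a language. The first function checks only whether a string is well formed; the second lets the caller choose what the + symbol means.

```python
import ast
import operator

def is_valid_syntax(source: str) -> bool:
    """Syntax: return True if `source` is a well-formed Python expression."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False

def evaluate(source: str, plus_means) -> int:
    """Semantics: evaluate `a + b`, where the caller decides what + means."""
    tree = ast.parse(source, mode="eval").body
    is_sum = isinstance(tree, ast.BinOp) and isinstance(tree.op, ast.Add)
    if not (is_sum
            and isinstance(tree.left, ast.Constant)
            and isinstance(tree.right, ast.Constant)):
        raise ValueError("this sketch only handles expressions of the form a + b")
    return plus_means(tree.left.value, tree.right.value)

print(is_valid_syntax("1 + 2"))          # True: follows the syntax rules
print(is_valid_syntax("1 + + )"))        # False: breaks the syntax rules
print(evaluate("2 + 3", operator.add))   # 5: the usual meaning of +
print(evaluate("2 + 3", operator.mul))   # 6: a language where + multiplies
```

The same string "2 + 3" is syntactically valid in both of the last two calls; only the semantics assigned to + differ.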
The TIOBE Index analyzes the popularity of 2,000 programming languages. To be counted as a programming language by the TIOBE Index, a language must meet the following criteria:
- Have its own entry on Wikipedia.
- Be Turing complete, meaning able to compute any calculation that a programmable computer can.
- Have a minimum of 5,000 hits for "<language> programming" on Google.
The TIOBE Index ratings are calculated from the number of hits on popular search engines around the world. In March 2018, the TIOBE Index ranked Java as the most widely used programming language in the world, followed by C and C++ [5].
What Makes A Good Programming Language?
A programming language can be classified as good based on its mode of development, usability, and execution efficiency. A programming language is what you will use to write a computer program. A good programming language can lead you to the correct result quickly, in a natural and easy manner. A bad language might add so much complexity that you abandon the attempt and try another approach. Some of the criteria for a good programming language are highlighted below:
- Program Design - deciding how to implement the flow of logic and how the data are to be handled. A good programming language should have a proper structure and well-defined semantics that make it easy to reason about what your program will do.
- Simplicity - being able to easily explain and communicate the syntax, structure, and semantics to others.
- Correctness - making sure that the program is correct and will produce the expected results.
- Expressiveness - less is more, meaning that shorter code can still yield effective results.
- Readability - programs written in the language are easy to read and comprehend. Spending time trying to understand a language can be discouraging.
- Security - at some point, we all want to write secure programs. A good programming language should help enforce security.
- Modularity - a good programming language should allow you to separate your program into different modules that can interact with one another. Modular programming improves manageability and code reusability.
- Error Handling - being able to detect exceptional conditions at run time and record information about them.
- Documentation - a good programming language should have documentation that helps users learn how to use every function.
How Programming Languages Are Made: (From a Software Perspective)
Just like human languages, each programming language is made up of tokens and grammar. Tokens deal with the lexical part of a language, while grammars deal with the syntactic part.
Tokens are sets of symbols or strings with meanings, and together they build up the language. Tokens are usually defined using name-value pairs. A token name is a category of symbols, characters, or sets of characters (words). These symbols or characters make up the token values.
Some common token name and values used are:
| Token Name | Token values |
| Comment token | //comment, /*comment*/ |
Generally, values that are assigned to the same token name should exhibit the same behavior. This means it would be awkward to have the if keyword as an operator. Operators should be symbols that operate on arguments and produce results, such as "+", "-", "/", "*", etc.
Tokens are covered in more detail in Chapter 10 of the textbook.
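The idea of tokens as name-value pairs can be sketched in a few lines of Python. This is a minimal illustration, not the approach the textbook develops later: the token names in `TOKEN_SPEC` and the function `tokenize` are hypothetical, and each name is paired with a regular expression describing the values in that category.

```python
import re

# Hypothetical token names, each paired with a pattern for its values.
# Order matters: comments are tried before operators so "//" is not read
# as two division signs, and keywords are tried before identifiers.
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*|/\*.*?\*/"),
    ("NUMBER",     r"\d+"),
    ("KEYWORD",    r"\b(?:if|else|while)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,  # lets /* ... */ comments span several lines
)

def tokenize(source):
    """Convert source text into a list of (token name, token value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not kept
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for name, value in tokenize("if x = 1 + 2 // sum"):
    print(name, value)
```

Here "if" is reported under the KEYWORD name rather than OPERATOR, which reflects the point that values sharing a token name should behave alike.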
Grammars are sets of rules that define the syntax, or arrangement, of tokens so that they form meaningful phrases of the programming language. In programming languages, grammars define aspects such as what makes an expression or a statement valid. Grammars are covered in more detail in Chapter 9 of the textbook.
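As a small illustration of a grammar deciding what makes an expression valid, the sketch below checks a list of tokens against a two-rule grammar. The grammar and the function name `is_valid_expression` are assumptions made for this example only; they are not taken from a later chapter.

```python
# A hypothetical two-rule grammar:
#   expression -> term (("+" | "-") term)*
#   term       -> NUMBER | "(" expression ")"

def is_valid_expression(tokens):
    """Return True if the token list can be derived from the grammar above."""
    pos = 0  # index of the next token to examine

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos].isdigit():
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not expression():
                return False
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def expression():
        nonlocal pos
        if not term():
            return False
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            pos += 1
            if not term():
                return False
        return True

    # Valid only if the rules match and every token was consumed.
    return expression() and pos == len(tokens)

print(is_valid_expression(["1", "+", "2"]))   # True
print(is_valid_expression(["1", "+"]))        # False: "+" needs a term after it
```

Each function corresponds to one rule of the grammar, which is why this style of checker is often called recursive descent.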
In this section, we give a brief summary of how tokens and grammars are generated and compiled together.
Tokens are generated with a tool known as Lex, a lexical analyzer generator. Lex reads an input stream and converts it to source code in the C programming language (a .c file). The extension for lex files is .l. There are several advantages to using lex, including the fact that it is faster and that it handles errors. Lex files are divided into three sections separated by two percent signs (%%): the definitions section, for defining macros and importing headers; the rules section, for associating regular expressions with C statements; and the C code section, which contains C code [31]. Lex is covered in more detail in Chapter 12 of the textbook.
Once the tokens are defined, the grammars are generated using a tool called yacc (Yet Another Compiler-Compiler). The .c file generated by lex is passed to a parser (phrase analyzer) written in yacc. The parser produces a tree of nodes, determines whether the sequence of tokens follows the rules of the grammar, and takes predefined actions, including producing syntax errors when tokens do not match the rules. Yacc is covered in more detail in Chapter 12 of the textbook.
The last part involves interpreting or compiling the language and determining runtime configuration.
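The parsing and interpreting stages described in this section can be imitated in plain Python. The sketch below is not lex or yacc and does not use their file formats. It is a hand-written stand-in, with hypothetical names `parse` and `evaluate`, that takes (name, value) tokens, builds a tree of nodes, raises a syntax error when the tokens do not match the rules, and then interprets the tree.

```python
# Assumed grammar for this sketch:
#   expr -> term ("+" term)*
#   term -> NUMBER ("*" NUMBER)*

def parse(tokens):
    """Build a tree of nodes from (name, value) tuples or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def number():
        nonlocal pos
        name, value = peek()
        if name != "NUMBER":
            raise SyntaxError(f"expected a number at token {pos}, found {value!r}")
        pos += 1
        return ("num", int(value))

    def term():
        nonlocal pos
        node = number()
        while peek() == ("OPERATOR", "*"):
            pos += 1
            node = ("*", node, number())
        return node

    def expr():
        nonlocal pos
        node = term()
        while peek() == ("OPERATOR", "+"):
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r} at token {pos}")
    return tree

def evaluate(tree):
    """Interpret the tree by walking its nodes."""
    if tree[0] == "num":
        return tree[1]
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)

tokens = [("NUMBER", "1"), ("OPERATOR", "+"), ("NUMBER", "2"),
          ("OPERATOR", "*"), ("NUMBER", "3")]
tree = parse(tokens)
print(tree)            # ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
print(evaluate(tree))  # 7
```

Because the term rule sits below the expr rule, multiplication binds more tightly than addition, so the tree encodes 1 + (2 * 3).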
Benefits of Programming Languages
Why Study Programming Languages?
Studying programming languages will increase your productivity and success at your job. You will learn to understand the benefits and hindrances of a language for the project at hand. By learning programming languages, you will provide the following assets to your team [11]:
- An increased capacity to express programming concepts. These concepts are present in the majority of computer programming languages and are fundamental to the programming process.
  - Algorithm - a set of instructions designed to perform a specific task. Algorithms are often created as functions that serve as small programs that can be referenced by a larger program [12].
  - Array - a list of related values [18].
  - Class - a set of instructions to build a specific type of object [21].
  - Compiler - a software program that translates source code written in a high-level programming language into low-level object code (binary code) in machine language [14].
  - Conditional - an expression that evaluates to either true or false to determine the flow through if and while statements [17].
  - Datatype - the kind of data that a value can have [15].
  - Function - a module of code that performs a specific task, usually taking in data, processing it, and returning a result [20].
  - Loop - a control structure that repeats a statement until a condition becomes false [19].
  - Source Code - the set of instructions and statements written in a programming language. The source code will contain declarations, instructions, functions, loops, and other statements, which act as instructions for the program [13].
  - Variable - a suggestive name that represents a value and can be used independently of the information it represents [16].
- An improved background for choosing appropriate languages. Understanding the complexity, flaws, documentation, ease and flexibility of use, and technical characteristics of different programming languages will allow you to help your company use the right amount of resources, whether that be time or money [1b].
- An increased ability to learn new languages. Programming languages keep changing and new types of languages keep appearing, so a solid foundation makes it easier to move each project to the optimal language.
- Overall advancement of computing [22].
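Several of the concepts listed above (algorithm, array, class, conditional, datatype, function, loop, and variable) appear together in even a very small program. The following Python sketch is purely illustrative; the class name `Averager` and its methods are invented for this example.

```python
class Averager:
    """Class: a blueprint for objects that collect values and average them."""

    def __init__(self):
        self.values = []  # variable holding an array (a Python list)

    def add(self, value: float) -> None:  # datatype: value is a float
        """Function: takes in data and stores it."""
        self.values.append(value)

    def mean(self) -> float:
        """Function implementing a simple algorithm: the arithmetic mean."""
        if not self.values:        # conditional: avoids dividing by zero
            return 0.0
        total = 0.0
        for value in self.values:  # loop: repeats once per element
            total += value
        return total / len(self.values)

scores = Averager()                # object built from the class
for score in [90, 80, 70]:
    scores.add(score)
print(scores.mean())               # 80.0
```

The text of this example is itself source code; a compiler or interpreter is what turns it into something the machine can run.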
Who Uses Programming Languages?
Manufacturers and engineers have traditionally used FORTRAN, short for "formula translation". Released to the public in 1957, FORTRAN was a digital code interpreter that was designed to approximate human language and could guarantee reasonable compatibility between different computer systems [24]. In the 1970s, the portability of the C language came to match that of FORTRAN, and C is now one of the most common programming languages used in engineering. In mechanical engineering, C is commonly used for data acquisition and real-time robotic control. C is also used in more than 90% of desktop computer programs, from operating systems to word processors [25].
Programming languages are also used in the natural sciences, specifically in chemistry and biology. A study published by Wiley-VCH showcases molecular informatics, the ability to store and process information using molecules. This computer-assisted strategy would be used in the fields of drug discovery and chemical biology, protein and nucleic acid engineering and design, the design of nanomolecular structures, strategies for modeling macromolecular assemblies, molecular networks and systems, and pharmaco- and chemogenomics [26].
Review Questions
1. Explain the difference between syntax and semantics.
2. Who is credited for the first computer programming language? What year was it created?
3. How does machine language differ from assembly language?
4. What makes a good programming language?
5. What are tokens? How are they defined?
6. What are grammars?
7. Define source code. What does it contain?
8. What is logic programming?
Key Terms
- Algorithm: a set of instructions designed to perform a specific task. Algorithms are often created as functions that serve as small programs that can be referenced by a larger program.
- Array: a list of related values.
- Array Programming: also known as vector or multi-dimensional programming; generalizes operations on scalars so that they apply transparently to vectors, matrices, and higher-dimensional arrays.
- Class: a set of instructions to build a specific type of object.
- Compiler: a software program that translates source code written in a high-level programming language into low-level object code (binary code) in machine language.
- Computer Programming Language: a formal language that expresses a set of instructions for a digital computer.
- Conditional: an expression that evaluates to either true or false to determine the flow through if and while statements.
- Datatype: the kind of data that a value can have.
- Formal Language: a language that abides by precise rules of syntax and semantics.
- Function: a module of code that performs a specific task, usually taking in data, processing it, and returning a result.
- Functional Programming: a declarative programming paradigm that treats the execution of code as the evaluation of mathematical functions, while avoiding changes to the program’s state and mutable data.
- Grammars: sets of rules that define the syntax, or arrangement, of tokens so that they form meaningful phrases of the programming language.
- Lexical Analyzer: also known as Lex. Lex reads an input stream and converts it to source code in the C programming language (a .c file).
- Logic Programming: a programming paradigm that is largely based on formal logic. A program written in a logic programming language is simply a set of statements in logical form, expressing facts and rules about a problem domain.
- Loop: a control structure that repeats a statement until a condition becomes false.
- Natural Language: a language used in everyday conversation between humans.
- Object-Oriented Programming (OOP): a paradigm based on the principle of defining “objects” through “classes”. By analogy, objects are like houses and classes are like the blueprints for the houses.
- Paradigm: the style of building the structure and elements of a computer program.
- Procedural Programming: a paradigm derived from structured programming and based upon the concept of the procedure call.
- Programming Language Specification: a document that defines a programming language. ALGOL became a model for how later language specifications were written.
- Semantics: a precise set of rules that tell you the meanings of the symbols and legal expressions.
- Side Effects: changes to program state that a function makes beyond returning its result.
- Source Code: the set of instructions and statements written in a programming language. The source code will contain declarations, instructions, functions, loops, and other statements, which act as instructions for the program.
- Syntax: a precise set of rules that determine the structure of statements, the allowed symbols, and the legal combinations of expressions.
- System Programming: the activity of programming a computer’s operating system software.
- TIOBE Index: a ranking of the popularity of programming languages.
- Tokens: sets of symbols or strings with meanings that together build up the language.
- Turing Complete: describes a language that is able to compute any calculation that a programmable computer can.
- Variable: a suggestive name that represents a value and can be used independently of the information it represents.
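The entries for Functional Programming and Side Effects can be contrasted with a short Python sketch. The names `counter`, `impure_next`, and `pure_next` are hypothetical and exist only for this illustration.

```python
counter = 0  # state that lives outside any function

def impure_next():
    """Has a side effect: each call changes the global counter."""
    global counter
    counter += 1
    return counter

def pure_next(value):
    """No side effects: the result depends only on the argument."""
    return value + 1

print(pure_next(1), pure_next(1))      # 2 2: same input, same output
print(impure_next(), impure_next())    # 1 2: same call, different output
```

A functional style favors functions like `pure_next`, because their behavior can be understood without tracking any outside state.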