 CHAPTER 11

INPUT/OUTPUT
A MORE DETAILED LOOK
 


Section II: Basic Input Using Instances of the Class istream

  


A. The '>>' Operator
As you know, 'cin' is an instance of the class 'istream'. We could declare our own additional instances of this class, but there rarely is a need to. This section, then, will first review the functionality we already know, related to input with 'cin', and then explore some additional elements.

Whenever we write

cin >> ..........

the bits being read from the input stream are interpreted in terms of the type of the variable on the right hand side. In all cases, the default behavior is to ignore leading white space (blanks, tabs, and newlines) and then, when the first non-blank character appears on the stream, read data into the variable until a character that cannot belong to the value is found. For example, if the variable is of type 'int' the system will skip all leading white space and attempt to translate the codes that follow as digits in the range 0 - 9 (with the first character being an optional '+' or '-'). Reading stops at the first character that is not a digit, and that character stays on the stream for the next read. If no digits at all were found - the first character is a letter or, for example, a comma - the input fails because the system cannot translate this code into an integer value. When this happens, the cin stream has a status of 'fail'. (We will see the implications of this in a bit.)

Similar rules apply for doubles. However, with variables of type char, only one character is read, and it is much harder to have an input failure. Note that in both cases all leading white space is again skipped. The same process occurs with strings, but here one has to be more careful. Since the system only stops removing characters from the stream when it encounters white space, it is possible to overflow the memory set aside for the string. If this happens, the system will blithely continue reading characters into memory meant for other purposes until white space is encountered. The consequences of this could be disastrous. One nice feature: when the trailing white space is encountered, a NULL character ('\0') is automatically inserted into the string, so the string manipulation operations will work as long as too many characters are not entered.
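The behavior of '>>' with an 'int' can be tried out with a small sketch. The helper names 'readInt' and 'nextCharAfterInt' are invented for this illustration, the standard headers '<istream>' and '<sstream>' are assumed, and a 'std::istringstream' stands in for the keyboard so that the run is repeatable; passing 'std::cin' instead would behave the same way.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: tries to read one int with '>>' and reports success.
// Any istream can be passed in: std::cin, a file, or a string stream.
bool readInt(std::istream& in, int& value)
{
    in >> value;          // skips leading white space, then translates digits
    return !in.fail();    // false when no integer could be read
}

// Hypothetical demo: which character is left on the stream after the int?
char nextCharAfterInt(const char* text)
{
    std::istringstream in(text);
    int n = 0;
    char ch = '?';
    if (readInt(in, n))
        in >> ch;         // the character that stopped the integer read
    return ch;
}
```

Given the text "   42abc", 'readInt' stores 42 and the next character read is 'a'; given "abc", 'readInt' reports failure because no digits were found.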

B. The 'get' Member Function
Since there is this danger of overflow and because it is not possible using the ">>" operator to read in a string with embedded blanks (the input stops at the first blank after the first non-blank character), it is often better to use the member functions 'get' or 'getline' when reading strings. The function 'get' actually has a number of different forms. The simplest form:

cin.get(ch);

(where 'ch' is a variable of type char) can be used to read in one character at a time. Since this function does not skip leading white space, it can be used to read in a blank. (One cannot do this with '>>' because it always skips leading blanks.) If blanks have significance as input, this could be very useful.
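A minimal sketch of the difference, assuming the standard headers '<istream>' and '<sstream>'. The function names 'countWithGet' and 'countWithExtract' are made up for this example.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a keyboard

// Counts characters read one at a time with get(): blanks are counted too.
int countWithGet(std::istream& in)
{
    int count = 0;
    char ch;
    while (in.get(ch))    // get(char&) returns the stream; it tests false at end of input
        ++count;
    return count;
}

// Counts characters read with '>>': white space is skipped, so blanks are lost.
int countWithExtract(std::istream& in)
{
    int count = 0;
    char ch;
    while (in >> ch)
        ++count;
    return count;
}
```

For the five-character input "a b c", 'countWithGet' reports 5 while 'countWithExtract' reports 3, because the two blanks never reach the variable when '>>' is used.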

Earlier, we saw how functions can be overloaded - to take on more than one purpose - by using different types and/or number of parameters. The 'get' function has been overloaded a number of times. When called with, first, a string (character array) parameter and, second, an integer parameter, it can be used to read characters until it encounters a newline character ('\n') or until it has read "integer - 1" characters. Thus, if we write:

cin.get(string, 5);

the system will read in up to four characters (stopping sooner if it encounters a newline character) AND add a NULL character. The programmer must make sure that 'string' is large enough to hold four characters and a NULL character but this is much safer than assuming that the user will not enter too many characters.
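Here is a small sketch of that call wrapped in a function so it can be tried on different inputs. The name 'readShort' is invented for the example, and 'std::string' is used only as a convenient way to hand the result back.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads at most four characters of the current line into a 5-element array.
std::string readShort(std::istream& in)
{
    char buffer[5];        // room for four characters plus the NULL character
    buffer[0] = '\0';
    in.get(buffer, 5);     // stops after 4 characters, or sooner just before a '\n'
    return std::string(buffer);
}
```

With the input "abcdefg" followed by a newline, the first call returns "abcd" and a second call returns "efg"; the characters that did not fit simply wait on the stream.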

This form of 'get' actually has a third parameter. We don't need to use it because it has a default value of '\n'. Thus, when a call to 'get' does not provide a third parameter, the function performs as described - using the default value to indicate the character to look for to stop reading in input. However, if, for example, one wishes a program to read in possibly multiple lines of characters until 80 characters or a period (.) is entered, one could write:

cin.get(string1, 81, '.');


In this case if one entered the following (on three separate lines):

    Hello
    There
    Curtis.
The string would contain "Hello\nThere\nCurtis". (Yes, the newline characters are included.) Then, if the code also included the line:

cout << string1;

the output would be on three lines - because of the newline characters stored in the string!
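The three-line example can be reproduced with a string stream in place of the keyboard. This is only a sketch; 'readSentence' is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads up to 80 characters, stopping early at a period.
std::string readSentence(std::istream& in)
{
    char string1[81];
    string1[0] = '\0';
    in.get(string1, 81, '.');    // newlines are ordinary characters here
    return std::string(string1);
}
```

Fed the three lines "Hello", "There", "Curtis.", the function returns "Hello\nThere\nCurtis". The period itself is not stored; it is still the next character on the stream.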

Having learned about function overloading and parameter default values, we can imagine how the function 'get' is declared in the file "iostream.h". There are some obscure points about what are called 'far' pointers and 'unsigned characters' in the actual declarations that we will skip, but if you examine this file you will see three function declarations for 'get' that are relevant to us:

    int get();
    istream& get( char& );
    istream& get( char*, int, char = '\n');
Let's use these declarations both to learn more about input operations and as a review of what we know about function declarations. First, we need to understand what each version receives and returns. The first version is designed to return the next element on the stream or the symbol for end-of-file (EOF). Since characters really are integer codes and the code for EOF is typically -1, this version of the function is declared to return an integer. Most C++ compilers allow one to return an integer and convert it automatically into a character as long as the integer is small enough - characters take up one byte while integers take up two or more. Some systems, however, do not allow negative numbers in variables of type char, and you will see programs where variables that are clearly meant to handle characters are declared as ints, just in case an EOF or other such symbol is returned by a function.

The second and third versions of this function, for reasons that are unimportant here, return the modified stream. Notice that a reference to the stream is returned (that is what the '&' means after the return type) so that a copy of the stream is not made. The second version also needs to return a character via its sole parameter; thus it has the '&' after the parameter type. We have not discussed this before but technically one need not put a variable name for each parameter in the declarations. All that is required is the parameter type. However, it seems safer and simpler to understand if we continue to provide parameter names in our declarations.

We have already discussed the third version. It delivers a string through its first parameter, thus that parameter is a 'char*'. We do not need the '&' symbol because the '*' means the address of the array is being passed, so the function can already write into the caller's memory. The second parameter is the maximum length of the string (including the NULL character), while the third parameter is the delimiter, the character at which input will stop if encountered before the maximum number of characters have been read. Note that this third parameter has a default value. In other words, as we already know, if the third parameter is dropped in a call to 'get', the delimiter will automatically be the newline (\n) symbol.
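The first version, the one that returns an 'int', is typically used in a loop like the one sketched below. The function name 'countLines' is invented for the example, and the 'EOF' symbol is assumed to come from the standard header '<cstdio>'.

```cpp
#include <cstdio>    // EOF
#include <istream>
#include <sstream>

// Counts newline characters using the no-argument get().
int countLines(std::istream& in)
{
    int lines = 0;
    int c;                             // int rather than char, so EOF can be held
    while ((c = in.get()) != EOF)      // get() returns the next code or EOF
    {
        if (c == '\n')
            ++lines;
    }
    return lines;
}
```

Declaring 'c' as an 'int' is the habit described above: the variable is meant to hold characters, but it must also be able to hold the EOF value.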

C. The 'getline' Member Function
Similar to 'get' is the 'getline' member function. The single declaration of importance to us is:

istream& getline( char*, int, char = '\n');

If called as:

cin.getline(string1, 81);

this function will read in up to 80 characters or until the newline character is encountered. There is no version of this function declared with simply a 'char' as the first parameter. Therefore, you cannot read data into a character type variable using 'getline' – as you can with the first two versions of 'get'. This makes sense because, as its name implies, 'getline' is meant to read in a line at a time. You can, however, define what a 'line' means to you by choosing to call this function with a third parameter, thus overriding the value of '\n' as the delimiter.

The main difference between 'getline' and 'get' is that 'getline' does remove the delimiter from the stream. As we saw in chapter 10, Section 8, when using the 'get' function, it is often necessary to also use the 'ignore' member function to remove the newline character before proceeding with the next 'get'. If this is not done, the next 'get' will also encounter the newline and will stop immediately. The 'getline' function does all this for us. With keyboard input, where typing the 'Enter' key is crucial to getting information processed, the difference between 'get' and 'getline' is not too important. One can either use 'get' and 'ignore' together or use 'getline' by itself. However, in processing files, the difference can be more significant.
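The two approaches can be compared side by side. In this sketch the function names are hypothetical and a string stream plays the part of a two-line input.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads two lines with getline: the '\n' is removed from the stream each time.
std::string secondLineWithGetline(std::istream& in)
{
    char line[81] = "";
    in.getline(line, 81);     // first line; the delimiter is consumed
    in.getline(line, 81);     // second line
    return std::string(line);
}

// The same task with get: the '\n' must be discarded by hand with ignore().
std::string secondLineWithGet(std::istream& in)
{
    char line[81] = "";
    in.get(line, 81);         // first line; the '\n' stays on the stream
    in.ignore();              // remove that one '\n'
    in.get(line, 81);         // second line
    return std::string(line);
}
```

Both functions return "second" for the input lines "first" and "second". Leaving out the call to 'ignore' in the second function would make the second 'get' stop at once on the leftover newline.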

D. The Use of the setw Manipulator
There are a number of function-like operators, called manipulators, that can be used to change the way the input and output operators ('>>' and '<<') work. We have already seen one of these - 'endl'. There is a second one we will explore briefly here in conjunction with input.

The manipulator 'setw' (short for set width) can be used to indicate the maximum number of characters to be read in with the '>>' operator. In other words, it is a way to use '>>' to read in strings and guarantee that the string array does not overflow. It does not, however, change the fact that '>>' stops when it encounters a blank. Thus you cannot use it to read strings that contain blanks.

Consider the following code:

    char string[5];
    cout << "Please enter a string\n";
    cin >> setw(5) >> string;
    cout << string << endl;
If a user enters the string "123456", only the string "1234" will be output. That is because 'setw(5)' says that only 4 characters will actually be read - with the fifth left for the infamous NULL character. The string will not overflow!

Now consider the same code with two lines added:

    char string[5];
    cout << "Please enter a string\n";
    cin >> setw(5) >> string;
    cout << string << endl;
    cin >> setw(5) >> string;
    cout << string;
Remember, the setw manipulator stopped the first input from reading more than four characters. Any left over characters are still on the stream. Thus, if we use the same input ("123456"), the second read here gets the last two characters, and the output will be:

1234
56

Note that 'setw' only affects the very next '>>' operation. Thus, it needs to be repeated each time you want to restrict the number of characters read. Also, to use it in a program, one must include the header file 'iomanip.h' (named 'iomanip', without the '.h', in standard C++).
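Because 'setw' must be repeated for every read, it fits naturally inside a loop. The sketch below, which assumes the standard header '<iomanip>' and uses the invented name 'readInPieces', collects a long entry four characters at a time.

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads white-space-separated input in pieces of at most four characters.
std::vector<std::string> readInPieces(std::istream& in)
{
    std::vector<std::string> pieces;
    char piece[5];                          // four characters plus the NULL character
    while (in >> std::setw(5) >> piece)     // setw is repeated on every pass
        pieces.push_back(piece);
    return pieces;
}
```

For the input "123456789" the pieces are "1234", "5678", and "9"; a blank in the input still ends a piece early, exactly as with an unrestricted '>>'.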

E. Handling Invalid Input
By now it is likely that you have managed more than once to get yourself into an infinite loop by entering some invalid data. For example, if you have a loop that expects the user to continuously enter integers until some sentinel value is entered, but the user types a non-numeric character, the program will go into an infinite loop. Consider the following very simplistic code:

 
	int x;
	cout << "Enter a number\n";
	cin >> x;
	while (x != 0)
	{	cout << " A number was read\n";
		cout << "Enter a number\n";
		cin >> x;
	}

This code will cause such an infinite loop if the user types in non-numeric characters. (The symbols '+' and '-' are legal as the first character.) The reason has nothing to do with the quality of the code and everything to do with the way C++ represents and handles streams. As noted earlier, associated with every stream variable such as 'cin' and 'cout', is a data structure representing the status of the input or output operations on the stream. Possible status values include:
  • good (everything is just fine)
  • EOF (an end of file code has been encountered)
  • fail (the last stream operation failed)
  • bad (a serious I/O error has occurred)

When, for example, the system attempts to read a non-numeric character into an integer variable, an error occurs that sets the stream status to 'fail'. Once this happens, any further request to read is ignored. Therefore, if a user types, for example, a 'Q' when requested for a number in the above code, the status flag is set to 'fail', the input is ignored, and the system returns to the top of the loop. Since 'x' does not yet equal 0, the loop again attempts to execute the line

cin >> x;

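However, the instruction is ignored because the status flag has been set to 'fail'. This means that the value of 'x' has not changed and the loop repeats again. In fact, 'x' will never change because the status flag will always be 'fail' and the system will ignore the read instruction. We have an infinite loop! (One caution: compilers that follow the C++11 standard or later store 0 in 'x' when the first failing read occurs. In this particular example that happens to match the sentinel, so the loop ends quietly instead; with any other sentinel value the loop runs forever exactly as described.)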
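The sticky nature of the 'fail' status can be observed directly. In this sketch the helpers 'stateName' and 'staysFailed' are invented names; the member functions 'bad', 'eof', and 'fail' are the standard ones for testing a stream's status.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: names the current status of a stream.
std::string stateName(const std::ios& s)
{
    if (s.bad())  return "bad";
    if (s.eof())  return "eof";
    if (s.fail()) return "fail";
    return "good";
}

// Returns true if a second read is still refused after a first failed read.
bool staysFailed(std::istream& in)
{
    int x = 0;
    in >> x;                       // fails if the next item is not a number
    bool failedOnce = in.fail();
    in >> x;                       // ignored while the status is 'fail'
    return failedOnce && in.fail();
}
```

For the input "Q 5", 'staysFailed' returns true: the 5 is perfectly readable, but the stream never gets as far as looking at it.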

To fix this we need to check for the 'fail' error status, clear the status flag if necessary, get rid of the illegal character and then continue. Below is a code fragment that demonstrates. It uses a 'while' loop that simply keeps asking for numbers until the user enters a 0. Note that the 'while' instruction itself is not part of the error handling. It is included because such error handling is needed when an input fails inside a loop.


int x;
char ch1;
cout << "Enter a number\n";
cin >> x;
while (x != 0)
{	if (cin.fail())
	{	cin.clear();		// reset the error status to 'good'
		cin >> ch1;		// discard the offending character
		cout << "Invalid value entered\n";
	}
	else
	{	cout << " A number was read\n";
	}
	cout << "Enter a number\n";
	cin >> x;
}
The function call

cin.fail()

checks the error status of 'cin'. If an error has occurred, the code:

  • goes inside the if statement;
  • uses the function call "cin.clear();" to clear the error status, that is, the error status is reset to 'good';
  • reads the invalid data into a variable of type char - to get rid of the non-numeric data;
  • tells the user what happened

If an error has not occurred, this code simply announces that the read was a success. In a real application, the line

cout << " A number was read\n";

would be replaced by whatever was to be done with valid input. In both cases (valid or invalid data), another number is read and the system starts again. Here you see another pattern (Pattern G1, "Clean Handling of Invalid Data") that you might want to consider adding to your code, whenever you are working with input that could be invalid.
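The same pattern can be packaged as a function that tests the stream status before it looks at the value, so that it does not matter what a failed read leaves in 'x'. This is a sketch under stated assumptions, not the pattern's official form: the type 'Tally' and the function 'readUntilZero' are invented here, and the check for 'eof' is an addition so the loop also ends if the input simply runs out.

```cpp
#include <istream>
#include <sstream>

struct Tally            // hypothetical result type for this sketch
{
    int valid;          // how many non-zero numbers were read
    int invalid;        // how many offending characters were discarded
};

// Reads numbers until the sentinel 0, discarding invalid characters.
Tally readUntilZero(std::istream& in)
{
    Tally tally = {0, 0};
    int x;
    char ch1;
    for (;;)
    {
        if (in >> x)                       // a number was read
        {
            if (x == 0)
                break;                     // sentinel value: stop
            ++tally.valid;
        }
        else if (in.eof() || in.bad())
        {
            break;                         // input ran out, or a serious error
        }
        else                               // 'fail': next character is not numeric
        {
            in.clear();                    // reset the status to 'good'
            in >> ch1;                     // get rid of the offending character
            ++tally.invalid;
        }
    }
    return tally;
}
```

For the input "3 Q 4 0" the function reports two valid numbers and one invalid character. Calling it as 'readUntilZero(std::cin)' applies the same logic to keyboard input.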

By the way, you may have noticed that one of the so-called error states was 'EOF' or "end of file". Of course, this is not really an error state. It represents the state of a stream when it has encountered the end of file character code. This code is the equivalent to striking "Ctrl-D" on your keyboard (hitting the CTRL key and the 'D' key simultaneously) on Unix-style systems; on DOS and Windows the equivalent is "Ctrl-Z". Therefore, although it may seem strange, the 'cin' stream could be in the 'eof' state, if the user typed Ctrl-D and all input prior to this had been processed.
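A short sketch shows how a program can tell whether a read loop ended because the input ran out or because something invalid was entered. The name 'sumToEof' and its second parameter are invented for the illustration.

```cpp
#include <istream>
#include <sstream>

// Adds up integers until a read fails; reports whether end of file was the cause.
int sumToEof(std::istream& in, bool& reachedEof)
{
    int sum = 0;
    int x;
    while (in >> x)          // the stream tests false once a read fails
        sum += x;
    reachedEof = in.eof();   // true if the loop ended because input ran out
    return sum;
}
```

For the input "1 2 3" the sum is 6 and 'reachedEof' is true; for "1 2 x" the sum is 3 and 'reachedEof' is false, since the loop stopped on the 'x' rather than at the end of the input.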

************************

That is all we will say here about input operators, functions, and manipulators. There are more ways to control input and more ways to accomplish the same things discussed here. For example, one can also use the member function 'width' to set the width of a string to be input as in:

    char string[5];
    cin.width(5);
    cin >> string;
    cout << string << endl;
If you want to familiarize yourself with the intricacies of input, you should explore the manual for your compiler or the other books we mentioned at the beginning of this chapter.

Topics Covered in the "Essentials of C++"

The Member Functions 'get' and 'getline'
Stream Error Codes
Testing and Clearing Error Codes
Width and Other Stream Data Members
Width and Other Stream Function Members
setw and Other Manipulators
