Dec 2006
retval is your friend
Paul Graham would say that hacking is similar to painting, and it is, but the similarity goes beyond method or technique. One item many hacker writers skip over is experience. This text deals with one big lesson drawn from experience: why using return values (or internal signals, if you will) is important. A signal-based paradigm is not only a good idea, it makes life easier.
retstrings?
Return strings, as practically every piece of interface documentation shows, do not have to be strings. A return value can be whatever the programmer wants it to be, regardless of the language. This matters because some languages supply a return value by default: Perl returns the result of the last operation evaluated, much like the shell. A program written in C/C++ or Java can do the same thing; the only difference is that a few more hoops may need to be jumped through to implement a non-string return from a function that was designed to return a string. On the flip side, there is also the guaranteed null return: if a string operation fails for whatever reason, the program builds a real NULL string for the platform (which should be handled by glibc or libc) and thus guarantees a proper NULL return. The last and perhaps most effective approach is simply to set a string value that is the equivalent of a signal. Error strings are used all the time: a set of predefined string names that, within the local context of a program, simply mean something bad. In short, retval does not mean just numbers.
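As a rough illustration of the error-string idea, here is a minimal shell sketch. The function name lookup_port, its table of ports, and the ERR_NOKEY sentinel are all hypothetical, chosen only for this example; a shell function "returns" a string by printing it, so the caller captures the output and tests it for the agreed-upon error prefix.

```shell
#!/bin/sh
# Hypothetical lookup: prints the port for a known service name, or the
# sentinel string ERR_NOKEY when the name is unknown.
lookup_port() {
    case "$1" in
        ssh)  echo 22 ;;
        http) echo 80 ;;
        *)    echo ERR_NOKEY ;;
    esac
}

# The caller treats any value starting with ERR_ as a signal, not data.
result=$(lookup_port gopher)
case "$result" in
    ERR_*) echo "lookup failed: $result" ;;
    *)     echo "port is $result" ;;
esac
```

The convention only works if the sentinel can never be a legitimate value, which is why the sketch reserves a whole prefix for errors.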
void and no-ops bad?
Nope. One of the big beefs out there is that Perl insists on returning something which may not be wanted. If a function can be void, why should a programmer have to say so? The reverse policy should be used instead: if an implicit return is desired, use it; if not, do not force people to deal with one. One of the big upsides of the C language is how retvals can be tossed out the window.
Usually, automatic returns do no harm. There are cases, such as subroutines parsing buffered input in Perl and shell scripts, where forced returns can cause problems like buffer mangling, but in general they are harmless. Using void functions is completely legitimate since 9 times out of 10 their return values are inconsequential.
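The shell shows both sides of this in a few lines. The sketch below is only an illustration; the function names is_number and banner are made up for it. A shell function with no explicit return takes the status of its last command, and a caller is free to ignore that status entirely.

```shell
#!/bin/sh
# Implicit return: no "return" statement, so the function's status is
# whatever the last command executed inside it produced.
is_number() {
    case "$1" in
        ''|*[!0-9]*) false ;;   # empty, or contains a non-digit
        *)           true ;;
    esac
}

# "void" in spirit: called for its output only.
banner() {
    echo "== $1 =="
}

banner "input checks"           # status deliberately ignored
if is_number 42; then
    echo "42 is a number"
fi
```

Nothing forces the caller of banner to look at a status, and nothing forces is_number to spell one out.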
There are so many shell scripts that check system status that anyone with a penny for each might not be rich, but would be well-to-do. One of the dangers of shell scripting is in-the-box thinking, especially on a team. This case study shows two versions of a very simple shell script that exhibit two different behaviors.
The first script is the bad one: it acts only for itself and does nothing to help anyone else:
...
for i in "$@"; do
    ssh "$i" uptime || logger -i -s "$0 could not reach ${i}"
done
...
exit 0
While syslog may be getting monitored somewhere, what if one wanted to use this script from another script? Because it always exits 0, a caller cannot tell whether any host was unreachable.
A better script might offer a few alternatives to just logging:
...
sshchk()
{
    errors=0
    # Loop over every host given as an argument
    for i in "$@"; do
        ssh "$i" uptime
        if [ $? -gt 0 ]; then
            logger -i -s "SSH Connect error to ${i}"
            errors=$((errors+1))
        fi
    done
    return $errors
}
...
exit $errors
Now at least the caller, shell or script, will know there was a problem. It may even be permissible to say only that at least one check failed, and do the following:
...
sshchk()
{
    errors=0
    for i in "$@"; do
        ssh "$i" uptime
        if [ $? -gt 0 ]; then
            logger -i -s "SSH Connect error to ${i}"
            errors=$((errors+1))
        fi
    done
    # Collapse the count into a simple pass/fail status
    if [ "$errors" -gt 0 ]; then
        return 1
    fi
    return 0
}
...
In which case the caller knows something went wrong; it just does not know how many times.
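To make the caller's side concrete, here is a minimal sketch of a script acting on such a return value. It is an illustration under stated assumptions, not the scripts above: probe is a hypothetical stand-in for "ssh host uptime" so the example runs without a network, and count_failures is a made-up name. One detail worth noting when returning a count: a shell exit status only holds values from 0 to 255, so the sketch caps the count rather than risk a large number being misread.

```shell
#!/bin/sh
# Hypothetical stand-in for "ssh $host uptime": fails for any host whose
# name starts with "bad". Replace with a real probe as needed.
probe() {
    case "$1" in
        bad*) return 1 ;;
        *)    return 0 ;;
    esac
}

# Return the number of failed probes, capped at 255 because an exit
# status cannot carry anything larger.
count_failures() {
    errors=0
    for host in "$@"; do
        probe "$host" || errors=$((errors + 1))
    done
    [ "$errors" -gt 255 ] && errors=255
    return "$errors"
}

# The caller decides what a failure means; the checker only reports it.
status=0
count_failures web1 bad1 bad2 || status=$?
if [ "$status" -eq 0 ]; then
    echo "all hosts reachable"
else
    echo "$status host(s) unreachable"    # prints: 2 host(s) unreachable
fi
```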
Many ideas in any programming language first come from inline testing. Inline testing is a nice way of saying cram it in using ifdefs. A good example of using returns in C is when if-then-else is not needed or may not even be applicable.
In the example below, using pseudocode, a mount point is being checked and then an additional check is added: error if it is group or world readable. It does not matter if it is C, Perl or Java; the idea is the same. One version checks within the existing body of code while the other, very succinctly, does not.
if MOUNTPOINT
    return 0
endif

if MOUNTPOINT
    if EXEC
        return 1
    if WORLDREAD
        return 1
    if GROUPREAD
        return 1
    return 0
retval to save the day...
if MOUNTPOINT
    retval=check_mount_perms
    return retval
...
check_mount_perms
    if EXEC
        return 1
    if WORLDREAD
        return 1
    if GROUPREAD
        return 1
    return 0
...
So what does the additional function do? It does two things. First, it takes the complexity of the permission checks out of the simple mount point check. Second, it makes it possible to add or alter the checks on an as-needed basis. What if checking for SGID or GGID were needed? What if only group and user checks were needed? In the latter version, such checks can be added without obfuscating the mount check too much.
What if more is needed? It is much easier to do this:
if MOUNTPOINT
    retval=check_mount_perms
    return retval
...
check_mount_perms
    if EXEC
        return 1
    if WORLDREAD
        return 1
    if GROUPREAD
        return 1
    if GGID != MYGROUP
        return 1
    return 0
...
That beats adding yet another check to the main calling program.
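For readers who want something runnable, the pseudocode might translate to shell roughly as follows. This is a sketch under several assumptions: a plain directory test stands in for a real mount point test, WORLDREAD and GROUPREAD are read as permission bits taken from the mode string that ls -ld prints, the EXEC and GGID checks are left out, and the status 2 for "not a directory" is an invention of this example.

```shell
#!/bin/sh
# Return 1 if the path is group- or world-readable, 0 otherwise.
# The mode string from "ls -ld" looks like drwxr-x---; character 5 is
# the group read bit and character 8 is the world read bit.
check_mount_perms() {
    mode=$(ls -ld -- "$1" | cut -c1-10)
    case "$mode" in
        ????r?????) return 1 ;;   # GROUPREAD
        ???????r??) return 1 ;;   # WORLDREAD
    esac
    return 0
}

# The mount check stays simple and just passes the retval along.
check_mount() {
    [ -d "$1" ] || return 2       # assumption: directory test as stand-in
    retval=0
    check_mount_perms "$1" || retval=$?
    return "$retval"
}

if check_mount /tmp; then
    echo "/tmp: permissions ok"
else
    echo "/tmp: check failed with status $?"
fi
```

Adding another test later means adding one more pattern or condition inside check_mount_perms; check_mount itself does not change.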
The question of how much message passing to do versus actually doing work is as old as computational devices. There is no simple answer. The best answer, and here I refer yet again to Paul Graham, is to do what seems natural. Just keep in mind that well-formed returns, whether they are strings, numbers or NULL, are ultimately up to the writer, not the machine.
As mentioned earlier, there are times when a program does not need to return a value, so when might that be? One good example is purely informational output, the classic usage function:
#include <stdio.h>

void usage(void) {
    printf("Here is some info...\n");
}
This always applies everywhere. A simple echo command and the like is just fine; even proven, well-known operations such as flag checking work great without explicit return values.
Return values help users and programmers alike every day. Making prudent use of them as a shell scripter, Perl monger or C programmer just makes it easier for all of us. The key part to remember is judicious use: if a check seems intrusive, just push it into a module and return something; otherwise just try to do the right thing.