## Sunday, 28 June 2015

### LEX and YACC program to evaluate an arithmetic expression

In this post, I give a LEX and YACC program that evaluates an arithmetic expression.
(Note: If you only want to check whether an arithmetic expression is valid or invalid, go through this post: LEX YACC program to check / recognize valid Arithmetic Expression)
Here the arithmetic expression may use the operations Addition (+), Subtraction (-), Multiplication (*), Division (/) and Modulus (%). The expression may also contain balanced round brackets.
An explanation of the program is given at the end of this post.
Go through the following programs:

Lex Program: (sample.l)

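%{
#include <stdio.h>
#include <stdlib.h>   /* atoi() */
#include "y.tab.h"    /* token codes generated by yacc -d */
extern int yylval;
%}

%%
[0-9]+ {
    /* Convert the matched digits into the token's integer value. */
    yylval = atoi(yytext);
    return NUMBER;
}
[ \t] ;                 /* skip spaces and tabs */
[\n] return 0;          /* a newline ends the input */
. return yytext[0];     /* any other character is its own token */
%%

int yywrap()
{
    return 1;
}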

Yacc program: (sample.y)

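%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s);
int flag = 0;   /* set to 1 by yyerror() when the input is rejected */
%}
%token NUMBER

%left '+' '-'
%left '*' '/' '%'
%left '(' ')'
%%
ArithmeticExpression: E {
    printf("\nResult=%d\n", $1);
    return 0;
}
;
E: E '+' E { $$ = $1 + $3; }
| E '-' E { $$ = $1 - $3; }
| E '*' E { $$ = $1 * $3; }
| E '/' E {
    /* Dividing by zero would crash the program, so reject it. */
    if ($3 == 0) { yyerror("division by zero"); YYABORT; }
    $$ = $1 / $3;
}
| E '%' E {
    if ($3 == 0) { yyerror("modulus by zero"); YYABORT; }
    $$ = $1 % $3;
}
| '(' E ')' { $$ = $2; }
| NUMBER { $$ = $1; }
;
%%

int main(void)
{
    printf("\nEnter Any Arithmetic Expression which can have operations Addition, Subtraction, Multiplication, Division, Modulus and Round brackets:\n");
    yyparse();
    if (flag == 0)
        printf("\nEntered arithmetic expression is Valid\n\n");
    return 0;
}

void yyerror(const char *s)
{
    (void)s;   /* the parser's message is not printed */
    printf("\nEntered arithmetic expression is Invalid\n\n");
    flag = 1;
}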

How To Run:
>>>yacc -d sample.y
>>>lex sample.l
>>>gcc lex.yy.c y.tab.c
>>>./a.out

Output:

For Invalid Expressions

Explanation:
LEX
A LEX program consists of three sections: "Definitions", "Regular Expressions and action for each regular expression" (the rules) and "Subroutines". The sections are separated by lines containing only %%.

The definition section holds C language code such as header file inclusions and global variable declarations/definitions. This C code is written between the symbols %{ and %}.

LEX uses regular expressions (patterns) to recognize lexemes and classify them into tokens. A token is a category such as identifier, header file or constant, while lexemes are the actual words used in your input program, e.g. printf, scanf, stdio.h etc. The regular expressions and the actions for them are written in the second section. When the yylex() function is called, it starts the process of pattern matching. Lex makes the matched string available through the pointer yytext and keeps its length in yyleng. The value of the token is passed to the parser in the variable yylval, which the action has to set itself. In the above program NUMBER is a token, and atoi() is a standard C function that converts the matched string of digits to an integer value for yylval.
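
To make the roles of yytext, yyleng and yylval more concrete, here is a rough hand-written sketch in C of what the generated scanner does for the rules above. This is not the code Lex produces. The names `scan_init`, `next_token`, `token_text`, `token_len`, `token_val`, `TOK_NUMBER` and `TOK_END` are made up for illustration, and 258 is only an assumed token code.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the codes Yacc would put in y.tab.h. */
enum { TOK_END = 0, TOK_NUMBER = 258 };

static const char *input;      /* remaining input, set by scan_init() */
static char token_text[64];    /* plays the role of yytext */
static int  token_len;         /* plays the role of yyleng */
static int  token_val;         /* plays the role of yylval */

void scan_init(const char *s) { input = s; }

/* Returns the next token code, mimicking one call to yylex(). */
int next_token(void)
{
    /* Skip spaces and tabs, like a rule with an empty action. */
    while (*input == ' ' || *input == '\t')
        input++;

    /* A newline or the end of the string means "no more tokens". */
    if (*input == '\0' || *input == '\n')
        return TOK_END;

    /* [0-9]+ : copy the lexeme, then convert it with atoi().
       Digits beyond the buffer size are left for the next call. */
    if (isdigit((unsigned char)*input)) {
        token_len = 0;
        while (isdigit((unsigned char)*input) && token_len < 63)
            token_text[token_len++] = *input++;
        token_text[token_len] = '\0';
        token_val = atoi(token_text);
        return TOK_NUMBER;
    }

    /* Any other character is returned as its own token code. */
    token_text[0] = *input;
    token_text[1] = '\0';
    token_len = 1;
    return (unsigned char)*input++;
}
```

For the input "12 + 7", successive calls to `next_token()` return `TOK_NUMBER` (with `token_val` set to 12), then '+', then `TOK_NUMBER` (with `token_val` set to 7), and finally `TOK_END`.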

The third section consists of subroutines/functions. Lex calls the yywrap() function when it reaches the end of the input. yywrap() should return 1 when the work is done, or 0 when more input has been made available and scanning should continue. The yylex() function actually starts the process of pattern matching. In the above program, we have not called yylex() ourselves, since yyparse() in the yacc program automatically calls yylex() every time it needs the next token. That's why there is no need to call yylex() separately. If you are writing a standalone Lex program, then you have to call yylex() in the main() function of the Lex program.
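
The return value of yywrap() is easier to see with more than one input. The following is only a sketch of the idea in plain C, not Lex's own code: `wrap()` stands in for yywrap(), and `inputs`, `cursor` and `count_chars()` are hypothetical names used for the demonstration.

```c
#include <stddef.h>

/* Two hypothetical chunks of input to be scanned one after another. */
static const char *inputs[] = { "1+2", "3*4" };
static size_t current = 0;          /* index of the chunk being scanned */
static const char *cursor = NULL;   /* position inside that chunk */

/* Plays the role of yywrap(): called when the current input runs out. */
int wrap(void)
{
    if (current + 1 < sizeof inputs / sizeof inputs[0]) {
        cursor = inputs[++current];
        return 0;   /* more input was set up: keep scanning */
    }
    return 1;       /* nothing left: scanning is finished */
}

/* Counts the characters in all chunks, asking wrap() at each end of input. */
int count_chars(void)
{
    int n = 0;
    current = 0;
    cursor = inputs[0];
    for (;;) {
        if (*cursor == '\0') {
            if (wrap())
                return n;
            continue;
        }
        cursor++;
        n++;
    }
}
```

Here `count_chars()` returns 6, because `wrap()` returns 0 once to switch to the second chunk and then 1 to stop. The yywrap() in the program above always returns 1, since there is only one input.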

YACC
A YACC program also consists of three sections: "Definitions", "Context Free Grammar and action for each production" and "Subroutines/Functions".

In the first section, we can write C language code such as header file inclusions and global variable or constant definitions/declarations. This C code is written between the symbols %{ and %}. We can also define tokens in the first section. In the above program, NUMBER is the token. We can define the associativity of the operators (i.e. left associativity or right associativity) with %left and %right. In the above yacc program, we have specified left associativity for all operators. Priorities among the operators are specified by the order of these declarations: priority increases from top to bottom. For e.g. in the above yacc program, '*', '/', '%' have higher priority than '+', '-'. Operators declared on the same line have the same priority. For e.g. in the above program all of '*', '/', '%' have the same priority. The round brackets '(' and ')' are declared last, so they nominally get the highest priority, but that line is not actually needed: grouping with brackets is handled by the production '(' E ')' in the grammar.
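
To see what these priority and associativity declarations mean for the result, here is a small C sketch that evaluates the same kind of expression by hand. It uses precedence climbing, which is a different technique from the parsing tables that yacc generates, and it assumes well-formed input and non-zero divisors. The function names `precedence`, `parse_primary`, `parse_expr` and `evaluate` are my own.

```c
#include <ctype.h>

static const char *p;   /* cursor into the expression being evaluated */

/* Priority of a binary operator; 0 means "not an operator". */
static int precedence(char op)
{
    switch (op) {
    case '+': case '-':           return 1;   /* declared first: lowest */
    case '*': case '/': case '%': return 2;   /* declared later: higher */
    default:                      return 0;
    }
}

static int parse_expr(int min_prec);

/* A primary is a number or an expression in round brackets. */
static int parse_primary(void)
{
    while (*p == ' ') p++;
    if (*p == '(') {
        p++;
        int inner = parse_expr(1);
        while (*p == ' ') p++;
        if (*p == ')') p++;
        return inner;
    }
    int v = 0;
    while (isdigit((unsigned char)*p))
        v = v * 10 + (*p++ - '0');
    return v;
}

static int parse_expr(int min_prec)
{
    int lhs = parse_primary();
    for (;;) {
        while (*p == ' ') p++;
        char op = *p;
        int prec = precedence(op);
        if (prec == 0 || prec < min_prec)
            return lhs;
        p++;
        /* Left associativity: the right operand may only contain
           operators of strictly higher priority than this one. */
        int rhs = parse_expr(prec + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        case '%': lhs %= rhs; break;
        }
    }
}

int evaluate(const char *expr) { p = expr; return parse_expr(1); }
```

With these rules `evaluate("2+3*4")` gives 14 because '*' has higher priority than '+', and `evaluate("8-3-2")` gives 3 because '-' is left associative, i.e. the expression is grouped as (8-3)-2.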

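In the second section, we write the grammar productions and the action for each production. In an action, $$ refers to the value of the symbol on the left-hand side of the production, i.e. the result that replaces the matched symbols on the value stack, while $1, $2, $3 refer to the values of the first, second and third symbols on the right-hand side. For e.g. in the production E '+' E, $1 and $3 are the values of the two operands.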
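
The way an action like $$ = $1 + $3 works on the value stack can be sketched in plain C. This is only an illustration under simplifying assumptions, not the code yacc generates: the stack holds plain integers, an operator is stored as its character, there are no overflow checks, and the names `values`, `depth`, `push_value` and `reduce_binary` are hypothetical.

```c
#define STACK_MAX 64

static int values[STACK_MAX];  /* one value per symbol read so far */
static int depth = 0;          /* how many values are on the stack */

/* Push the value of a symbol that was just read. */
void push_value(int v) { values[depth++] = v; }

/* Reduce by a production of the form  E : E op E  (assumes depth >= 3). */
int reduce_binary(void)
{
    int left  = values[depth - 3];   /* $1 */
    int op    = values[depth - 2];   /* $2 */
    int right = values[depth - 1];   /* $3 */
    int result = 0;                  /* $$ */

    switch (op) {
    case '+': result = left + right; break;
    case '-': result = left - right; break;
    case '*': result = left * right; break;
    case '/': result = left / right; break;
    case '%': result = left % right; break;
    }

    depth -= 3;           /* pop the three right-hand side values ... */
    push_value(result);   /* ... and push $$ in their place */
    return result;
}
```

For 2+3*4 the values 2, '+', 3, '*', 4 are pushed in turn. The first call to `reduce_binary()` replaces 3, '*', 4 with 12, and the second call replaces 2, '+', 12 with 14, which is the single value left on the stack.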

The third section consists of the subroutines. We have to call yyparse() to start the parsing process. The yyerror() function is called when the parser detects a syntax error, i.e. when the input statement cannot be matched by the productions of the grammar in the second section.