- In 'readutf8esc' (llex.c), the overflow check must be done before
shifting the accumulator. It was working because tests were using
64-bit longs. Failed with 32-bit longs.
- In OP_FORPREP (lvm.c), avoid negating an unsigned value. Visual
Studio gives a warning for that operation, despite being well
defined in ISO C.
- In 'luaV_execute' (lvm.c), 'cond' can be defined only when needed,
like all other variables.
When calling metamethods for things like 'a < 3.0', which generates
the opcode OP_LTI, the C register tells that the operand was
converted to an integer, so that it can be corrected to float when
calling a metamethod.
This commit also includes some other stuff:
- file 'onelua.c' added to the project
- opcode OP_PREPVARARG renamed to OP_VARARGPREP
- comparison opcodes rewritten through macros
Changed some implementation details; in particular, it is back using
an internal variable to keep the index, with the control variable
being only a copy of that internal variable. (The direct use of
the control variable demands a check of its type for each access,
which offsets the gains from the use of a single variable.)
- LUAC_VERSION is equal to LUA_VERSION_NUM, and it is stored
as an int.
- 'sizeof(int)' and 'sizeof(size_t)' removed from the header, as
the binary format does not depend on these sizes. (It uses its
own serialization for unsigned integer values.)
The numerical 'for' loop over integers now uses a precomputed counter
to control its number of iteractions. This change eliminates several
weird cases caused by overflows (wrap-around) in the control variable.
(It also ensures that every integer loop halts.)
Also, the special opcodes for the usual case of step==1 were removed.
(The new code is already somewhat complex for the usual case,
but efficient.)
All UTF-8 encoding functionality (including the escape
sequence '\u') accepts all values from the original UTF-8
specification (with sequences of up to six bytes).
By default, the decoding functions in the UTF-8 library do not
accept invalid Unicode code points, such as surrogates. A new
parameter 'nonstrict' makes them accept all code points up to
(2^31)-1, as in the original UTF-8 specification.
- The warning functions get an extra parameter that tells whether
message is to be continued (instead of using end-of-lines as a signal).
- The user data for the warning function is a regular value, instead
of a writable slot inside the Lua state.
When called with no arguments, 'math.randomseed' uses time and ASLR
to generate a somewhat random seed. the initial seed when Lua starts
is generated this way.
Removed code to ensure that strings inside Lua (as returned by
'lua_tolstring') always start in fully aligned addresses.
Since version 5.3 the documentation does not ensure that.
Several small improvements (code style, warnings, comments, more tests),
in particular:
- 'lua_topointer' extended to handle strings
- raises an error in 'string.format("%10q")' ('%q' with modifiers)
- in the manual for 'string.format', the term "option" replaced by
"conversion specifier" (the term used by the C standard)
After a major bad collection (one that collects too few objects),
next collection will be major again. In that case, avoid switching
back to generational mode (as it will have to switch again to
incremental to do next major collection).
The function 'string.gmatch' now has an optional 'init' argument,
similar to 'string.find' and 'string.match'. Moreover, there was
some reorganization in the manipulation of indices in the string
library.
This commit also includes small janitorial work in the manual
and in comments in the interpreter loop.
To-be-closed variables must contain objects with '__toclose'
metamethods (or nil). Functions were removed for several reasons:
* Functions interact badly with sandboxes. If a sandbox raises
an error to interrupt a script, a to-be-closed function still
can hijack control and continue running arbitrary sandboxed code.
* Functions interact badly with coroutines. If a coroutine yields
and is never resumed again, its to-be-closed functions will never
run. To-be-closed objects, on the other hand, will still be closed,
provided they have appropriate finalizers.
* If you really need a function, it is easy to create a dummy
object to run that function in its '__toclose' metamethod.
This comit also adds closing of variables in case of panic.
* unification of the 'nny' and 'nCcalls' counters;
* external C functions ('lua_CFunction') count more "slots" in
the C stack (to allow for their possible use of buffers)
* added a new test script specific for C-stack overflows. (Most
of those tests were already present, but concentrating them
in a single script easies the task of checking whether
'LUAI_MAXCCALLS' is adequate in a system.)
The script 'all', to run all tests, automatically ensures that the
Lua interpreter and the test C libraries (in 'testes/libs/') are
updated with any changes in 'luaconf.h'.
This file is not part of the regular tests. It tests error conditions
that demand too much memory or too much time to create:
* string with too many characters
* control structure with body too large
* chunk with too many lines
* identifier with too many characters
* chunks with too many instructions
* function with too many constants
* too many strings internalized
* table with too many entries
In machines with limited memory (less than 150 GB), many tests run up
to a "not enough memory" error. We need some memory (~256 GB) to
run all tests up to their intrinsic limits.
A long bracket with too many equal signs can overflow the 'int' used for
the counting and some arithmetic done on the value. Changing the counter
to 'size_t' avoids that. (Because what is counted goes to a buffer, an
overflow in the counter will first raise a buffer-overflow error.)
New functions to reset/kill a thread/coroutine, mainly (only?) to
close any pending to-be-closed variable. ('lua_resetthread' also
allows a thread to be reused...)
- in 'luaB_tonumber', do not need to "checkany" when argument
is a number.
- in 'lua_resume', the call to 'luaD_rawrunprotected' cannot return
a status equal to -1.
The call 'math.rand()' converts the higher bits of the internal unsigned
integer random to a float, instead of its lower bits. That ensures that
Lua compiled with different float precisions always generates equal (up
to the available precision) random numbers when given the same seed.
New auxiliary functions/macros 'luaL_argexpected'/'luaL_typeerror'
ease the creation of error messages such as
bad argument #2 to 'setmetatable' (nil or table expected, got boolean)
(The novelty being the "got boolean" part...)
A to-be-closed variable must be closed when a block ends, so even
a 'return foo()' cannot directly returns the results of 'foo'; the
function must close the scope before returning.
It is an error for a to-be-closed variable to have a non-closable
non-nil value when it is being closed. This situation does not seem to
be useful and often hints to an error. (Particularly in the C API, it is
easy to change a to-be-closed index by mistake.)
To remove a to-be-closed variable from the stack in the C API a
function must use 'lua_settop' or 'lua_pop'. Previous implementation of
'luaL_pushresult' was not closing the box. (This commit also added
tests to check that box is being closed "as soon as possible".)
(Long time without testing with '-DHARDSTACKTESTS'...)
With the introduction of to-be-closed variables, calls to 'luaF_close'
can move the stack, but some call sites where keeping pointers to the
stack without correcting them.
Added opcodes for all seven arithmetic operators with K operands
(that is, operands that are numbers in the array of constants of
the function). They cover the cases of constant float operands
(e.g., 'x + .0.0', 'x^0.5') and large integer operands (e.g.,
'x % 10000').
The visibility for functions marked as LUAI_FUNC was changed from
"hidden" to "internal". These functions cannot be called from
outside the Lua kernel, and "internal" visibility offers more
chances for optimizations.
Sometimes it is useful to mark to-be-closed an index that is not
at the top of the stack (e.g., if the value to be closed came from
a function call returning multiple values).
The variable to be closed in a generic 'for' loop now is the
4th value produced in the loop initialization, instead of being
the loop state (the 2nd value produced). That allows a loop to
use a state with a '__toclose' metamethod but do not close it.
(As an example, 'f:lines()' might use the file 'f' as a state
for the loop, but it should not close the file when the loop ends.)
The new syntax is <local *toclose x = f()>. The mark '*' allows other
attributes to be added later without the need of new keywords; it
also allows better error messages. The API function was also renamed
('lua_tobeclosed' -> 'lua_toclose').
The repetitive code of the arithmetic and bitwise operators in
the main iterpreter loop was moved to appropriate macros.
(As a detail, the function 'luaV_div' was renamed 'luaV_idiv',
as it does an "integer division" (floor division).
The mechanism of "caching the last closure created for a prototype to
try to reuse it the next time a closure for that prototype is created"
was removed. There are several reasons:
- It is hard to find a natural example where this cache has a measurable
impact on performance.
- Programmers already perceive closure creation as something slow,
so they tend to avoid it inside hot paths. (Any case where the cache
could reuse a closure can be rewritten predefining the closure in some
variable and using that variable.)
- The implementation was somewhat complex, due to a bad interaction
with the generational collector. (Typically, new closures are new,
while prototypes are old. So, the cache breaks the invariant that
old objects should not point to new ones.)
The implicit variable 'state' in a generic 'for' is marked as a
to-be-closed variable, so that the state will be closed as soon
as the loop ends, no matter how.
Taking advantage of this new facility, the call 'io.lines(filename)'
now returns the open file as a second result. Therefore,
an iteraction like 'for l in io.lines(name)...' will close the
file even when the loop ends with a break or an error.
"Better" and similar to error messages for invalid function arguments.
*old message: 'for' limit must be a number
*new message: bad 'for' limit (number expected, got table)
Statements like 'if cond then goto label' generate code so that the
jump in the 'if' goes directly to the given label. This optimization
cannot be done when the jump is backwards leaving the scope of some
variable, as it cannot add the needed 'close' instruction. (The jumps
were already generated by the 'if'.)
This commit also added 'likely'/'unlikely' for tests for errors in
the parser, and it changed the way breaks outside loops are detected.
(Now they are detected like other goto's with undefined labels.)
Added restriction that, when a label is created, there cannot be
another label with the same name visible. That allows backward goto's
to be resolved when they are read. Backward goto's get a close if
they jump out of the scope of some variable; labels get a close only
if previous goto to it jumps out of the scope of some upvalue.
Added new instruction 'OP_TFORPREP' to prepare a generic for loop.
Currently it is equivalent to a jump (but with a format 'iABx',
similar to other for-loop preparing instructions), but soon it will
be the place to create upvalues for closing loop states.