The robust rendering of estra large glyphs came with unbearable cost.
The old way of bisecting should fail but fail faster.
* src/smooth/ftgrays.c (gray_convert_glyph): Switch back to bisecting
in y-direction.
*/* [FT_CONFIG_OPTION_PIC]: Remove all code guarded by this
preprocessor symbol.
*/*: Replace `XXX_GET' macros (which could be either a function in
PIC mode or an array in non-PIC mode) with `xxx' arrays.
* include/freetype/internal/ftpic.h, src/autofit/afpic.c,
src/autofit/afpic.h, src/base/basepic.c, src/base/basepic.h,
src/base/ftpic.c, src/cff/cffpic.c, src/cff/cffpic.h,
src/pshinter/pshpic.c, src/pshinter/pshpic.h, src/psnames/pspic.c,
src/psnames/pspic.h, src/raster/rastpic.c, src/raster/rastpic.h,
src/sfnt/sfntpic.c, src/sfnt/sfntpic.h, src/smooth/ftspic.c,
src/smooth/ftspic.h, src/truetype/ttpic.c, src/truetype/ttpic.h:
Removed.
We used to split large glyphs into horizontal bands and continue
bisecting them still horizontally if that was not enough. This is
guaranteed to fail when a single scanline cannot fit into the
rendering memory pool. Now we bisect the bands vertically so that
the smallest unit is a column of the band height, which is guranteed
to fit into memory.
* src/smooth/ftgrays.c (gray_convert_glyph): Implement it.
At large sizes almost but not exactly horizontal segments can quickly
drain the rendering pool. This patch at least avoids filling the pool
with trivial cells. Beyond this, we can only increase the pool size.
Reported, analyzed, and tested by Colin Fahey.
* src/smooth/ftgrays.c (gray_set_cell): Do not record trivial cells.
* src/smooth/ftgrays.c [STANDALONE] (OVERFLOW_ADD_LONG,
OVERFLOW_SUB_LONG, OVERFLOW_MUL_LONG, NEG_LONG): New macros.
[!STANDALONE]: Include FT_INTERNAL_CALC_H.
(gray_render_cubic): Use those macros where appropriate.
We don't need some divisions if a line segments stays within a single
row or a single column of pixels.
* src/smooth/ftgrays.c (gray_render_line) [FT_LONG64]: Make divisions
conditional.
The algorithm calls `gray_set_cell' at the start of each new contour
or when the contours cross the cell boundaries. Double-checking for
that is wasteful.
* src/smooth/ftgrays.c (gray_set_cell): Remove check for a new cell.
(gray_convert_glyph): Remove initialization introduced by 44b172e88.
* src/smooth/ftgrays.c (gray_move_to): Call `gray_set_cell' directly
instead of...
(gray_start_cell): ... this function, which is removed.
(gray_convert_glyph): Make initial y-coordinate invalid.
It turns out that there is significant cost associated with `FT_Span'
creation and calls to `gray_render_span' because it happerns so
frequently. This removes these steps from our internal use but leaves
it alone for `FT_RASTER_FLAG_DIRECT" to preserve API. The speed gain
is about 5%.
* src/smooth/ftgrays.c (gray_render_span): Removed. The code is
migrated to...
(gray_hline): ... here.
Zero coverage is unlikely (1 out of 256) to warrant checking. This
gives 0.5% speed improvement in dendering simple glyphs.
* src/smooth/ftgrays.c (gray_hline, gray_render_span): Remove checks.
This gives 2% speed improvement in rendering simple glyphs.
* src/smooth/ftgrays.c (TPixmap): Reduced pixmap descriptor with a
pointer to its bottom-left and pitch to be used in...
(gray_TWorker): ... here.
(gray_render_span): Move pixmap flow check from here...
(gray_raster_render): .. to here.
This removes unnecessary complexity of span merging and buffering.
Instead, the spans are rendered as they come, speeding up the
rendering by about 5% percents as a result.
* src/smooth/ftgrays.c [FT_MAX_GRAY_SPANS]: Macro removed.
(gray_TWorker): Remove span buffer and related fields.
(gray_sweep, gray_hline): Updated.
* include/freetype/ftimage.h: Remove documentation note about
`FT_MAX_GRAY_SPANS', which was never in `ftoption.h' and is now gone.
Rasterization sub-banding is utilized at large sizes while using
rather small fixed memory pool. Indeed it is possible to make an
educated guess how much memory is necessary at a given size for a
given glyph. It turns out that, for large majority of European glyphs,
you should store about 8 times more boundary pixels than their height.
Or, vice versa, if your memory pool can hold 800 pixels the band
height should be 100 and you should sub-band anything larger than
that. Should you still run out of memory, FreeType bisects the band
but you have wasted some time. This is what has been implemented in
FreeType since the beginning.
It was overlooked, however, that the top band could grow to twice the
default band size leading to unnecessary memory overflows there. This
commit fixes that. Now the bands are distributed more evenly and
cannot exceed the default size.
Now the magic number 8 is really suitable for rather simple European
scripts. For complex Chinese logograms the magic number should be 13
but that is subject for another day.
* src/smooth/ftgrays.c (gray_convert_glyph): Revise sub-banding
protocol.
* src/smooth/ftgrays.c (TArea): Restore original definition as `int'.
(gray_render_line) [FT_LONG64]: Updated.
(gray_convert_glyph): 32-bit band bisection stack should be 32 bands.
(gray_convert_glyph_inner): Trace successes and failures.
This patch restores original `TCoord' definition as `int' so that the
rendering pool is used more efficiently on LP64 platforms (unix).
* src/smooth/ftgrays.c (gray_TWorker, TCell, gray_TBand): Switch some
fields to `TCoord'.
(gray_find_cell, gray_render_scanline, gray_render_line, gray_hline,
gray_sweep, gray_convert_glyph): Updated.