lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
'use strict';
|
|
|
|
const common = require('../common.js');
|
|
|
|
const readline = require('readline');
|
|
|
|
const { Readable } = require('stream');
|
|
|
|
|
|
|
|
const bench = common.createBenchmark(main, {
|
|
|
|
n: [1e1, 1e2, 1e3, 1e4, 1e5, 1e6],
|
2023-09-23 16:17:14 +08:00
|
|
|
type: ['old', 'new'],
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
});
|
|
|
|
|
|
|
|
const loremIpsum = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
|
|
|
|
do eiusmod tempor incididunt ut labore et dolore magna aliqua.
|
|
|
|
Dui accumsan sit amet nulla facilisi morbi tempus iaculis urna.
|
|
|
|
Eget dolor morbi non arcu risus quis varius quam quisque.
|
|
|
|
Lacus viverra vitae congue eu consequat ac felis donec.
|
|
|
|
Amet porttitor eget dolor morbi non arcu.
|
|
|
|
Velit ut tortor pretium viverra suspendisse.
|
|
|
|
Mauris nunc congue nisi vitae suscipit tellus.
|
|
|
|
Amet nisl suscipit adipiscing bibendum est ultricies integer.
|
|
|
|
Sit amet dictum sit amet justo donec enim diam.
|
|
|
|
Condimentum mattis pellentesque id nibh tortor id aliquet lectus proin.
|
|
|
|
Diam in arcu cursus euismod quis viverra nibh.
|
|
|
|
Rest of line`;
|
|
|
|
|
2023-09-23 16:17:14 +08:00
|
|
|
function oldWay() {
|
|
|
|
const readable = new Readable({
|
|
|
|
objectMode: true,
|
|
|
|
read: () => {
|
|
|
|
this.resume();
|
|
|
|
},
|
|
|
|
destroy: (err, cb) => {
|
|
|
|
this.off('line', lineListener);
|
|
|
|
this.off('close', closeListener);
|
|
|
|
this.close();
|
|
|
|
cb(err);
|
|
|
|
},
|
|
|
|
});
|
|
|
|
const lineListener = (input) => {
|
|
|
|
if (!readable.push(input)) {
|
|
|
|
// TODO(rexagod): drain to resume flow
|
|
|
|
this.pause();
|
|
|
|
}
|
|
|
|
};
|
|
|
|
const closeListener = () => {
|
|
|
|
readable.push(null);
|
|
|
|
};
|
|
|
|
const errorListener = (err) => {
|
|
|
|
readable.destroy(err);
|
|
|
|
};
|
|
|
|
this.on('error', errorListener);
|
|
|
|
this.on('line', lineListener);
|
|
|
|
this.on('close', closeListener);
|
|
|
|
return readable[Symbol.asyncIterator]();
|
|
|
|
}
|
|
|
|
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
function getLoremIpsumStream(repetitions) {
|
|
|
|
const readable = Readable({
|
|
|
|
objectMode: true,
|
|
|
|
});
|
|
|
|
let i = 0;
|
|
|
|
readable._read = () => readable.push(
|
2023-01-30 02:13:35 +08:00
|
|
|
i++ >= repetitions ? null : loremIpsum,
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
);
|
|
|
|
return readable;
|
|
|
|
}
|
|
|
|
|
2023-09-23 16:17:14 +08:00
|
|
|
async function main({ n, type }) {
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
bench.start();
|
|
|
|
let lineCount = 0;
|
|
|
|
|
|
|
|
const iterable = readline.createInterface({
|
|
|
|
input: getLoremIpsumStream(n),
|
|
|
|
});
|
|
|
|
|
2023-09-23 16:17:14 +08:00
|
|
|
const readlineIterable = type === 'old' ? oldWay.call(iterable) : iterable;
|
|
|
|
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
// eslint-disable-next-line no-unused-vars
|
2023-09-23 16:17:14 +08:00
|
|
|
for await (const _ of readlineIterable) {
|
lib: performance improvement on readline async iterator
Using a direct approach to create the readline async iterator
allowed an iteration over 20 to 58% faster.
**BREAKING CHANGE**: With that change, the async iteterator
obtained from the readline interface doesn't have the
property "stream" any longer. This happened because it's no
longer created through a Readable, instead, the async
iterator is created directly from the events of the readline
interface instance, so, if anyone is using that property,
this change will break their code.
Also, the Readable added a backpressure control that is
fairly compensated by the use of FixedQueue + monitoring
its size. This control wasn't really precise with readline
before, though, because it only pauses the reading of the
original stream, but the lines generated from the last
message received from it was still emitted. For example:
if the readable was paused at 1000 messages but the last one
received generated 10k lines, but no further messages were
emitted again until the queue was lower than the readable
highWaterMark. A similar behavior still happens with the
new implementation, but the highWaterMark used is fixed: 1024,
and the original stream is resumed again only after the queue
is cleared.
Before making that change, I created a package implementing
the same concept used here to validate it. You can find it
[here](https://github.com/Farenheith/faster-readline-iterator)
if this helps anyhow.
PR-URL: https://github.com/nodejs/node/pull/41276
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
2022-10-24 20:49:16 +08:00
|
|
|
lineCount++;
|
|
|
|
}
|
|
|
|
bench.end(lineCount);
|
|
|
|
}
|