1. Use an in-memory database
An in-memory database keeps its data in RAM and operates on it directly. Memory reads and writes are several orders of magnitude faster than disk access, so serving data from memory can dramatically improve application performance compared with fetching it from disk. In-memory databases abandon traditional disk-oriented data management: the architecture is redesigned around the assumption that all data resides in memory, with corresponding improvements in data caching, fast algorithms, and parallel operation, so data processing is much faster than in a traditional database.
Durability, however, is the biggest flaw of in-memory databases. Because memory loses its contents on power failure, we usually need to put protection mechanisms in place for the in-memory data in advance, such as backups, logging, hot standbys or clustering, and synchronization with an on-disk database. For data that is not critically important but must be served to users quickly, consider storing it in an in-memory database while periodically persisting it to disk.
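As a minimal sketch of this idea, SQLite from Python's standard library can run entirely in memory, and the periodic persist-to-disk step can be done with its backup API (the table, file name, and data below are illustrative):

```python
import sqlite3
import tempfile
import os

# In-memory database: all reads and writes happen in RAM.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, user TEXT)")
mem.execute("INSERT INTO sessions (user) VALUES (?)", ("alice",))
mem.commit()

# Periodically "solidify" the in-memory data to disk so a power
# loss does not wipe everything (shown here as a single backup).
disk_path = os.path.join(tempfile.mkdtemp(), "sessions.db")
disk = sqlite3.connect(disk_path)
mem.backup(disk)  # copy the whole in-memory database into the disk file
disk.commit()

rows = disk.execute("SELECT user FROM sessions").fetchall()
print(rows)
```

In a real server this backup call would run on a timer or after a batch of writes, trading a small durability window for in-memory speed.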
2. Use RDD
In applications involving big data and cloud computing, Spark can be used to speed up data processing. The core of Spark is the RDD, which originates from the Berkeley paper "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing". Existing data-flow systems handle two kinds of applications inefficiently: iterative algorithms, which are very common in graph processing and machine learning, and interactive data-mining tools. In both cases, keeping the data in memory can greatly improve performance.
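Spark itself needs a cluster setup, but the core RDD insight (keep the working set in memory across iterations instead of re-reading it from storage each time) can be shown with a toy pure-Python sketch; this is not Spark, and the file contents and iteration count are invented for the example:

```python
import os
import tempfile

# A simple numbers file stands in for a real dataset on disk.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write("\n".join(str(i) for i in range(100_000)))

def load():
    with open(path) as f:
        return [int(line) for line in f]

# Naive data-flow style: reload and reparse the dataset every iteration.
for _ in range(10):
    total_disk = sum(load())

# RDD-style persistence: load once, iterate over the cached in-memory data.
data = load()
for _ in range(10):
    total_mem = sum(data)

print(total_disk, total_mem)
```

The second loop pays the load-and-parse cost once instead of ten times, which is exactly what caching an RDD in memory buys an iterative algorithm.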
3. Increase the cache
Many web applications serve a large amount of static content, mostly small files that are read frequently, with Apache or nginx as the web server. When web traffic is light, both HTTP servers are fast and efficient. Under heavy load, we can put a cache server in front that holds the server's static resource files in operating-system memory, so reads are served directly from RAM, which is much faster than reading from the hard disk. In effect, this spends memory to reduce the time cost of disk access.
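One common way to set this up is nginx's built-in proxy cache, which keeps hot responses in a cache zone in front of the origin server; the paths, zone name, timings, and upstream address below are illustrative, not a recommended production configuration:

```nginx
# Cache metadata lives in a shared-memory zone; bodies are served via the OS page cache.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:64m
                 max_size=1g inactive=60m use_temp_path=off;

server {
    listen 80;

    location /static/ {
        proxy_cache static_cache;          # serve repeat requests from the cache
        proxy_cache_valid 200 302 10m;     # cache successful responses for 10 minutes
        proxy_cache_valid 404 1m;
        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass http://127.0.0.1:8080;  # origin web server (assumed address)
    }
}
```

The `X-Cache-Status` header makes it easy to verify HIT/MISS behavior while tuning the cache sizes.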
4. Use SSD
In addition to optimizing memory, the disk side can also be optimized. Compared with traditional mechanical hard drives, solid-state drives offer fast reads and writes, light weight, low power consumption, and small size. SSDs are more expensive than mechanical hard drives, but if the budget permits, they can be used to replace them.
5. Optimize the database
Most server requests ultimately hit the database, and as the amount of data grows, database access becomes slower and slower. To increase request-processing speed, the original single table must be split. The mainstream database on Linux servers today is MySQL, and with MySQL, once a single table holds tens of millions of records, queries become very slow. Partitioning and sharding the database according to rules appropriate to the business can effectively improve database access speed and overall server performance. In addition, for business query requests, indexes can be created when tables are defined to speed up lookups.
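The effect of an index can be sketched with SQLite from Python's standard library (the section's point applies equally to MySQL; the table, column, and index names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(100_000)],
)

# Without an index, this query must scan the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42"
).fetchone()[-1]

# Create an index on the column the business query filters on.
conn.execute("CREATE INDEX idx_orders_user ON orders(user_id)")

plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42"
).fetchone()[-1]

print(plan_before)  # typically a full SCAN of orders
print(plan_after)   # typically a SEARCH using idx_orders_user
```

The query planner switches from a full table scan to an index search, which is the difference between touching every one of the 100,000 rows and touching only the matching ones.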
6. Choose the right IO model
The I/O models are:
(1). Blocking I/O model: the call blocks until data arrives, then returns. recvfrom is the typical example, and blocking is generally the default.
(2). Non-blocking I/O model: the opposite of blocking. If the result is not yet available, the I/O call returns immediately and does not block the current thread.
(3). I/O multiplexing model: multiplexing means combining multiple signals into one channel for processing, like several pipes converging into one; the reverse is demultiplexing. The main multiplexing calls are select, poll, and epoll. For a single I/O port, the two calls and two returns are no better than blocking I/O; the key advantage is being able to monitor many I/O ports at the same time. These functions also block the process, but unlike blocking I/O they can block on multiple I/O operations at once, detecting readiness across many read and write descriptors and calling the actual I/O operation only when data is readable or writable.
(4). Signal-driven I/O model: first enable the socket's signal-driven I/O and install a signal handler via the sigaction system call. When a datagram is ready to be read, a SIGIO signal is delivered to the process. The handler can then call recvfrom to read the datagram and notify the main loop that the data is ready, or simply notify the main loop to read the datagram itself.
(5). Asynchronous I/O model: tell the kernel to start an operation and have the kernel notify us after the entire operation completes, including copying the data from the kernel into the user's own buffer.
This is not to say that one particular model must always be used; epoll does not outperform select in every case. The choice should still be driven by the needs of the business.
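The multiplexing idea from model (3), one call watching several descriptors and performing the read only once data is ready, can be sketched with Python's selectors module, which wraps epoll, kqueue, or select depending on the platform (the socketpair below stands in for real client connections):

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS, else select

# Two connected sockets stand in for client connections.
a, b = socket.socketpair()
for s in (a, b):
    s.setblocking(False)
    sel.register(s, selectors.EVENT_READ)

b.send(b"ping")  # make the other end of the pair readable

ready = sel.select(timeout=1)  # blocks until at least one descriptor is ready
messages = []
for key, _mask in ready:
    # Safe to read without blocking: select reported this descriptor readable.
    messages.append(key.fileobj.recv(1024))

print(messages)
sel.close()
a.close()
b.close()
```

The same register/select/read loop scales to thousands of registered sockets, which is where multiplexing beats one-blocking-call-per-connection.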
7. Use a multi-core processing strategy
Nowadays, mainstream server machines all have multi-core CPUs. When designing a server, we can take advantage of this by adopting a multi-process or multi-threaded framework. The choice between multithreading and multiprocessing can be made according to actual needs, weighing their respective advantages and disadvantages. When using multiple threads, and especially a thread pool, a suitable pool size can be found by benchmarking the server's performance with different pool sizes.
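A sketch of the thread-pool approach using Python's concurrent.futures: the pool size here starts from os.cpu_count(), but as noted above the right number should come from benchmarking, and the handler function is a made-up stand-in for real request work:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id: int) -> str:
    # Stand-in for real per-request work (parsing, DB call, building a response).
    return f"response-{req_id}"

# Start from the core count as a baseline; tune based on measured throughput.
pool_size = (os.cpu_count() or 2) * 2
with ThreadPoolExecutor(max_workers=pool_size) as pool:
    results = list(pool.map(handle_request, range(8)))

print(results)
```

For CPU-bound work in Python, ProcessPoolExecutor is the drop-in multi-process variant of the same interface; the benchmarking advice applies to either.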
8. Distributed deployment program
When no suitable optimization point can be found on a single machine, we can improve the server's responsiveness through distributed deployment. A good server design includes plans for scaling out and for disaster recovery. Personally, I feel it is better to keep the server design simple, which makes later expansion easier.