Solid surface arrays display hundreds to thousands of antigens on a surface such as nitrocellulose-coated glass [4]. These arrays were originally adapted from oligonucleotide arrays, which were developed in the 1990s and consisted of DNA fragments displayed on a solid surface. However, the initial oligonucleotide solid surface arrays were based on whole genomes and included non-expressed genes that are not biologically relevant to disease. This was overcome by the advent of solid surface arrays that incorporated only proteins and peptides expressed in vivo. Solid surface arrays now provide a high-throughput method for investigating protein-protein interactions, enzyme-substrate reactions, protein-drug interactions, and antibody-antigen interactions. Detection methods for solid surface arrays include fluorescence, mass spectrometry, and surface plasmon resonance.
Protein arrays
Protein arrays comprise full-length proteins or protein domains displayed on a solid surface. Some of the first high-content protein arrays were established in 2000, with over 10,000 full-length human proteins displayed on a glass slide [5]. Since then, protein arrays displaying human proteins have been used for a wide range of applications, including the detection of novel antibody biomarkers in autoimmune diseases. Protein arrays can be divided into two categories based on their method of protein synthesis: those produced using cellular expression, and those produced using cell-free methods such as in situ synthesis. Cellular-produced arrays use recombinant proteins that are expressed in a host organism before being immobilised on the solid surface. The choice of expression host is an important consideration; common hosts include bacteria (most commonly Escherichia coli), yeast (Saccharomyces cerevisiae), insect cells, and human cells. The expression host can affect protein glycosylation and other post-translational modifications such as citrullination, and antibodies are known to recognise these post-translational modifications [3, 6]. Proteins expressed in yeast and insect cells are glycosylated, unlike proteins expressed in bacterial hosts such as E. coli, although their glycosylation patterns still differ from those of humans. The number of proteins included in an array is limited by the labour and cost associated with protein expression and purification. This led to the development of cell-free, or in situ, protein array synthesis, in which proteins are synthesised directly onto the solid surface [7]. DNA or RNA libraries are bound to the solid surface and, just hours before a sample is run across the array, the proteins are freshly synthesised, minimising the risk of protein denaturation [7].
While custom protein arrays can be produced for a particular purpose, a variety of protein arrays are available commercially. One of the most widely used for the discovery of novel biomarkers in autoimmune diseases is the Human ProtoArray (Thermo Fisher Scientific), which is based on over 9000 human proteins expressed with an N-terminal GST tag using baculovirus, purified from Sf9 insect cells, and bound to a nitrocellulose-coated glass slide. The N-terminal GST tag allows the orientation in which each protein binds to the solid surface to be controlled, which may also influence how accessible the epitopes within each protein are. The largest category of antigens represented on the ProtoArray is metabolism-based proteins (35%), while 16% of displayed proteins are membrane-associated or secreted, and 5% are nuclear proteins. Although the Human ProtoArray was discontinued in 2018, many research groups have employed the platform to discover autoantibodies in autoimmune diseases (Table 1), including type 1 diabetes [8], systemic lupus erythematosus (SLE) [4], multiple sclerosis [9], and acute rheumatic fever [10].
Table 1
Description of the different human autoantigen profiling technologies discussed, including the advantages and limitations of each, and autoimmune diseases the technology has been applied to
| Technology | Antigen type | Advantages | Limitations | Platform | Number of antigens | Display surface | Autoimmune diseases applied to |
|---|---|---|---|---|---|---|---|
| Protein arrays | Full-length proteins | Can detect autoantibodies against conformational and discontinuous epitopes. Use of insect and yeast cells to produce antigens allows conformational epitopes and post-translational modifications to be represented. | Laborious and expensive to produce antigens. | The Human ProtoArray | >9,000 | Nitrocellulose-treated glass slide | Parkinson’s disease [51]; systemic lupus erythematosus (SLE) [4]; ankylosing spondylitis [52]; Alzheimer’s disease [54]; acute rheumatic fever [10]; autoimmune polyendocrine syndrome type 1 (APS1) [48]; paediatric acute disseminated encephalomyelitis [55] |
| | | | | HuProt Array | >20,000 | Glass slide | Acute rheumatic fever [10]; primary biliary cirrhosis [56]; multisystem inflammatory syndrome in children with a previous SARS-CoV-2 infection [36] |
| | | | | NAPPA | 80 to 12,000 | Amine-coated glass slide | Type 1 diabetes [15, 16]; ankylosing spondylitis [58]; rheumatoid arthritis [17] |
| | | | | i-Ome Protein Array Kit | >1,600 | Glass slide | Rheumatoid arthritis [20] |
| | | | | ImmunoINSIGHTS | 80-8,000 | Microspheres | Rheumatoid arthritis [21, 22] |
| | | | | In-house protein arrays | 5,011 | EpoxySlide | |
| Peptide and protein fragment arrays | Short peptides ranging from 4 to 20 amino acids in length | Can represent the entire human proteome, providing greater breadth for autoantigen discovery. Ideal for mapping epitopes. | Conformational and discontinuous epitopes not represented. Post-translational modifications not represented because prokaryotic expression hosts are used to produce the antigens. | The human proteome peptide array | Approx. 2.2 million | Amino-functionalised microscope slides | |
| | | | | PEPstar BioSynth | 50-5,000 | Glass slides | |
| | Longer protein fragments ranging from 80 to 100 amino acids in length | | | The Human Peptide Array | Up to 42,000 | Glass slides | Multiple sclerosis [12, 27]; rheumatoid arthritis [17, 30] |
| PhIP-Seq | Peptides 36 amino acids in length | Antigens are both produced and displayed by bacteriophage, reducing cost and labour. Displays the entire human proteome, including different splicing variants. Greater throughput of samples. Use of next-generation sequencing allows the magnitude of the autoantibody response to be investigated through the number of reads detected. | Lack of post-translational modifications. Lack of complex secondary structure in the antigens displayed. | T7 Pep Library | >400,000 | Bacteriophage | Rheumatoid arthritis [32] |
| | Peptides 49 amino acids in length | | | Human PhIP-Seq Library v2 | >700,000 | Bacteriophage | Paraneoplastic neurological disorders [35, 61] |
| | Peptides 90 amino acids in length | | | Sarcoidosis library; MIS-C library; Antygen HuScan commercial library | 1,152; 250,000; >250,000 | Bacteriophage | Multisystem inflammatory syndrome in children with a previous SARS-CoV-2 infection [36] |
| MIPSA | Full-length proteins | Can detect autoantibodies against conformational and discontinuous epitopes. Greater throughput of samples. Use of next-generation sequencing allows the magnitude of the autoantibody response to be investigated through the number of reads detected. | Lack of post-translational modifications. | MIPSA | >11,000 | RNAClean XP beads | Autoantibody reactivity in severe SARS-CoV-2 infections [38] |
| REAP | Extracellular and secreted proteins ranging from 50 to 600 amino acids in length | Eukaryotic expression host results in antigen folding and post-translational modifications more like those of humans. Greater throughput of samples. Use of next-generation sequencing allows the magnitude of the autoantibody response to be investigated through the number of reads detected. | Differing glycosylation patterns compared to humans. Does not represent the entire human proteome. | REAP | 2,688 | Yeast cells | Autoantibodies after a SARS-CoV-2 infection [41] |

Blank cells in the first four columns indicate that the entry from the row above applies.
A more recent protein array, and the most expansive currently available, is the HuProt Human Proteome Microarray (v4.0, CDI Laboratories). The 20,000 proteins cover 81% of the human proteome, including 87% of predicted secreted proteins and 78% of plasma membrane proteins, based on the Human Protein Atlas. This increased proportion of secreted and plasma membrane proteins offers an advantage over the Human ProtoArray for the detection of novel autoantibody biomarkers that are more likely to target extracellular and exposed antigens. The HuProt Array has been employed to detect novel autoantibodies in several autoimmune diseases (Table 1), including autoimmune hepatitis [11], multiple sclerosis [12], acute rheumatic fever [10], and SLE [13].
Nucleic Acid Programmable Protein Array (NAPPA) is a popular format of in situ protein synthesis in which proteins are synthesised and captured directly on the solid surface. For example, a NAPPA developed at the Biodesign Institute in the USA includes over 12,000 genes encoded to express proteins with a C-terminal GST tag, which ensures full translation of the protein prior to local capture on the solid surface [14]. Mammalian cell-free lysates and accessory proteins are added to synthesise the proteins in vitro, enabling simple post-translational modifications such as phosphorylation and citrullination to be represented. Autoimmune diseases in which NAPPA has been utilised (Table 1) include type 1 diabetes [15, 16] and rheumatoid arthritis [17].
An emerging human protein array approach is the i-Ome Protein Array Kit (Sengenics), which comprises over 1600 antigens, including signalling molecules and cytokines. The proteins are expressed in insect cells with a biotin carboxyl carrier protein (BCCP) tag that acts as a marker of correct protein folding, ensuring the display of conformational epitopes. While the inclusion of structural epitopes is an advantage of the i-Ome array, its limited size is a drawback, particularly for a discovery technology, given the potential breadth of the autoantibody repertoire in autoimmune disease. Autoimmune diseases in which the i-Ome Protein Array Kit has been used to detect novel autoantibodies include SLE [18, 19] and rheumatoid arthritis [20] (Table 1). Another emerging protein array is ImmunoINSIGHTS (formerly known as SeroTag; Oncimmune). It differs from the prior examples, which are based on planar solid surfaces, in that its more than 8000 human proteins are immobilised on Luminex bead-based suspension arrays, enabling solution-phase antigen binding. Although a relatively new technology, it has been utilised (Table 1) in rheumatoid arthritis [21, 22] and SLE [23].
Peptide and protein fragment arrays
In contrast to protein arrays, peptide arrays comprise small protein fragments or peptides rather than full-length proteins. Peptide arrays with overlapping, or tiled, peptides (6–20 amino acids in length) are generally used for epitope mapping within proteins that have previously been associated with autoimmune diseases [24–26]. Arrays that employ longer peptides or protein fragments (80–100 amino acids in length) are more often used to discover novel antigens associated with an autoimmune disease. Compared with short peptides, longer peptides or protein fragments are more likely to have some secondary structure and to contain conformational epitopes present in native proteins [12, 27]. As with protein arrays, a limiting factor in array production is the labour and cost associated with peptide synthesis. In situ synthesis is also available for peptide arrays, in which the peptides are synthesised in parallel directly onto the solid surface [28]. Commercial peptide array synthesisers such as the MultiPep synthesiser (Intavis) allow research groups to design and synthesise custom peptide arrays relevant to the autoimmune disease being studied [29].
Autoimmune diseases for which peptide arrays have been used to map epitopes within previously identified autoantigens include multiple sclerosis [26] and SLE [25] (Table 1). The largest peptide array to date contains approximately 2.2 million overlapping 12-amino-acid peptides that represent the entire human proteome, and it has been used to discover novel autoantibodies in multiple sclerosis [24]. Other examples of commercially available short peptide arrays include PEPperCHIP (PEPperPRINT), BioSynth (BioSynth), and PEPstar and Pepspots (JPT Innovative Peptide Solutions), which can provide custom-designed or standard arrays based on protein sequences.
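The overlapping, or tiled, peptide design used for epitope mapping can be illustrated with a minimal Python sketch. The function name `tile_peptides`, the toy sequence, and the step size of 6 residues are hypothetical choices made for illustration; the text specifies only that the peptides overlap and, for the largest array, that they are 12 amino acids long.

```python
def tile_peptides(sequence: str, length: int = 12, step: int = 6) -> list[str]:
    """Split a protein sequence into overlapping peptides.

    `length` is the peptide length in residues and `step` is the offset
    between consecutive peptide start positions (both are assumptions
    here, not values taken from any specific array design).
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        # A sequence no longer than one peptide is returned as a single tile.
        return [sequence]
    starts = list(range(0, len(sequence) - length + 1, step))
    # Add a final window so the C-terminal residues are always covered.
    last_start = len(sequence) - length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Arbitrary 28-residue toy sequence, not a real protein of interest.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLG"
peptides = tile_peptides(toy_protein)
for peptide in peptides:
    print(peptide)
```

A smaller step gives more overlap between neighbouring peptides, and therefore finer epitope resolution, at the cost of a larger number of peptides to synthesise.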
The Human Peptide Array (SciLifeLab) uses longer peptides and protein fragments ranging from 80 to 100 amino acids in length, allowing for novel autoantigen discovery. The array is based on unique Protein Epitope Signature Tags (PrESTs) designed by the Human Protein Atlas. These PrESTs have low homology to other human protein sequences so that every fragment is unique. The 42,000 fragments, representing 94% of the human proteome, are routinely expressed in E. coli, purified, and immobilised on microarrays to create the Human Peptide Arrays. Autoimmune diseases in which these peptide arrays have been used include multiple sclerosis [12, 27], rheumatoid arthritis [17, 30], and sarcoidosis [31] (Table 1).