# AI-Driven Watermarking Technique for Safeguarding Text Integrity in the Digital Age

`!pip -q install langchain huggingface_hub transformers sentence_transformers accelerate bitsandbytes`

```
import os
os.environ['HUGGINGFACEHUB_API_TOKEN'] = '<ENTER_HUGGING_FACE_API_KEY>'
```

```
# text1 = "A common use case when generating images is to generate a batch of images, select one image and improve it with a better, more detailed prompt in a second run. To do this, one needs to make each generated image of the batch deterministic. Images are generated by denoising gaussian random noise which can be instantiated by passing a torch generator."
text1 = """
Once upon a time there was a dear little girl who was loved by every one who looked at her, but most of all by her grandmother, and there was nothing that she would not have given to the child. Once she gave her a little cap of red velvet, which suited her so well that she would never wear anything else. So she was always called Little Red Riding Hood.
One day her mother said to her, "Come, Little Red Riding Hood, here is a piece of cake and a bottle of wine. Take them to your grandmother, she is ill and weak, and they will do her good. Set out before it gets hot, and when you are going, walk nicely and quietly and do not run off the path, or you may fall and break the bottle, and then your grandmother will get nothing. And when you go into her room, don't forget to say, good-morning, and don't peep into every corner before you do it."
I will take great care, said Little Red Riding Hood to her mother, and gave her hand on it.
The grandmother lived out in the wood, half a league from the village, and just as Little Red Riding Hood entered the wood, a wolf met her. Little Red Riding Hood did not know what a wicked creature he was, and was not at all afraid of him.
"Good-day, Little Red Riding Hood," said he.
"Thank you kindly, wolf."
"Whither away so early, Little Red Riding Hood?"
"To my grandmother's."
"What have you got in your apron?"
"Cake and wine. Yesterday was baking-day, so poor sick grandmother is to have something good, to make her stronger."
"Where does your grandmother live, Little Red Riding Hood?"
"A good quarter of a league farther on in the wood. Her house stands under the three large oak-trees, the nut-trees are just below. You surely must know it," replied Little Red Riding Hood.
The wolf thought to himself, "What a tender young creature. What a nice plump mouthful, she will be better to eat than the old woman. I must act craftily, so as to catch both." So he walked for a short time by the side of Little Red Riding Hood, and then he said, "see Little Red Riding Hood, how pretty the flowers are about here. Why do you not look round. I believe, too, that you do not hear how sweetly the little birds are singing. You walk gravely along as if you were going to school, while everything else out here in the wood is merry."
Little Red Riding Hood raised her eyes, and when she saw the sunbeams dancing here and there through the trees, and pretty flowers growing everywhere, she thought, suppose I take grandmother a fresh nosegay. That would please her too. It is so early in the day that I shall still get there in good time. And so she ran from the path into the wood to look for flowers. And whenever she had picked one, she fancied that she saw a still prettier one farther on, and ran after it, and so got deeper and deeper into the wood.
Meanwhile the wolf ran straight to the grandmother's house and knocked at the door.
"Who is there?"
"Little Red Riding Hood," replied the wolf. "She is bringing cake and wine. Open the door."
"Lift the latch," called out the grandmother, "I am too weak, and cannot get up."
The wolf lifted the latch, the door sprang open, and without saying a word he went straight to the grandmother's bed, and devoured her. Then he put on her clothes, dressed himself in her cap, laid himself in bed and drew the curtains.
Little Red Riding Hood, however, had been running about picking flowers, and when she had gathered so many that she could carry no more, she remembered her grandmother, and set out on the way to her.
She was surprised to find the cottage-door standing open, and when she went into the room, she had such a strange feeling that she said to herself, oh dear, how uneasy I feel to-day, and at other times I like being with grandmother so much.
She called out, "Good morning," but received no answer. So she went to the bed and drew back the curtains. There lay her grandmother with her cap pulled far over her face, and looking very strange.
"Oh, grandmother," she said, "what big ears you have."
"The better to hear you with, my child," was the reply.
"But, grandmother, what big eyes you have," she said.
"The better to see you with, my dear."
"But, grandmother, what large hands you have."
"The better to hug you with."
"Oh, but, grandmother, what a terrible big mouth you have."
"The better to eat you with."
And scarcely had the wolf said this, than with one bound he was out of bed and swallowed up Little Red Riding Hood.
When the wolf had appeased his appetite, he lay down again in the bed, fell asleep and began to snore very loud. The huntsman was just passing the house, and thought to himself, how the old woman is snoring. I must just see if she wants anything.
So he went into the room, and when he came to the bed, he saw that the wolf was lying in it. "Do I find you here, you old sinner," said he. "I have long sought you."
Then just as he was going to fire at him, it occurred to him that the wolf might have devoured the grandmother, and that she might still be saved, so he did not fire, but took a pair of scissors, and began to cut open the stomach of the sleeping wolf.
When he had made two snips, he saw the Little Red Riding Hood shining, and then he made two snips more, and the little girl sprang out, crying, "Ah, how frightened I have been. How dark it was inside the wolf."
And after that the aged grandmother came out alive also, but scarcely able to breathe. Little Red Riding Hood, however, quickly fetched great stones with which they filled the wolf's belly, and when he awoke, he wanted to run away, but the stones were so heavy that he collapsed at once, and fell dead.
Then all three were delighted. The huntsman drew off the wolf's skin and went home with it. The grandmother ate the cake and drank the wine which Little Red Riding Hood had brought, and revived, but Little Red Riding Hood thought to herself, as long as I live, I will never by myself leave the path, to run into the wood, when my mother has forbidden me to do so.
It is also related that once when Little Red Riding Hood was again taking cakes to the old grandmother, another wolf spoke to her, and tried to entice her from the path. Little Red Riding Hood, however, was on her guard, and went straight forward on her way, and told her grandmother that she had met the wolf, and that he had said good-morning to her, but with such a wicked look in his eyes, that if they had not been on the public road she was certain he would have eaten her up. "Well," said the grandmother, "we will shut the door, that he may not come in." Soon afterwards the wolf knocked, and cried, "open the door, grandmother, I am Little Red Riding Hood, and am bringing you some cakes." But they did not speak, or open the door, so the grey-beard stole twice or thrice round the house, and at last jumped on the roof, intending to wait until Little Red Riding Hood went home in the evening, and then to steal after her and devour her in the darkness. But the grandmother saw what was in his thoughts. In front of the house was a great stone trough, so she said to the child, take the pail, Little Red Riding Hood. I made some sausages yesterday, so carry the water in which I boiled them to the trough. Little Red Riding Hood carried until the great trough was quite full. Then the smell of the sausages reached the wolf, and he sniffed and peeped down, and at last stretched out his neck so far that he could no longer keep his footing and began to slip, and slipped down from the roof straight into the great trough, and was drowned.
But Little Red Riding Hood went joyously home, and no one ever did anything to harm her again.
"""
```

`text1 = "Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems."`

```
from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM
import torch

def watermark_text(text, model_name="bert-base-uncased", offset=0):
    # Clean and split the input text
    text = " ".join(text.split())
    words = text.split()
    # Replace every fifth word with [MASK], starting from the offset
    for i in range(offset, len(words)):
        if (i + 1 - offset) % 5 == 0:
            words[i] = '[MASK]'
    # Initialize the tokenizer and model; passing `device` to the pipeline
    # moves the model to the GPU if one is available (calling .to(-1) on the
    # model directly would fail on CPU)
    device = 0 if torch.cuda.is_available() else -1
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)
    # Initialize the fill-mask pipeline
    classifier = pipeline("fill-mask", model=model, tokenizer=tokenizer, device=device)
    # Make a copy of the words list to modify it
    watermarked_words = words.copy()
    # Process the text in chunks; each chunk ends just before the next mask,
    # so it contains exactly one unfilled [MASK]
    for i in range(offset, len(words), 5):
        if i + 4 >= len(words):
            break  # no mask left in the tail of the text
        chunk = " ".join(watermarked_words[:i + 9])
        if '[MASK]' in chunk:
            try:
                tempd = classifier(chunk)
            except Exception as e:
                print(f"Error processing chunk '{chunk}': {e}")
                continue
            if tempd:
                # Keep only the first token of the top-ranked prediction
                temps = tempd[0]['token_str']
                watermarked_words[i + 4] = temps.split()[0]
        # print("Done ", i + 1, "th word")
    # Output the results
    # print("Original Text:")
    # print(text)
    # print("Watermark Areas:")
    # print(" ".join(words))
    # print("Watermarked Text:")
    # print(" ".join(watermarked_words))
    return " ".join(watermarked_words)

# Example usage
text = "Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems."
watermark_text(text, offset=0)
```

```
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
```

`'Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are impossible for classical computers. Unlike quantum computers, which use bits as the fundamental unit of , quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously according to the principles of symmetry and entanglement, providing a significant advantage in solving complex mathematical problems.'`
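A quick way to see which fifth-word slots the model actually changed is to diff the original and watermarked strings word by word. This is a minimal sketch (the helper `diff_words` is not part of the notebook's code); it assumes both strings split into the same number of words, which holds here because watermarking replaces words one-for-one:

```python
def diff_words(original, watermarked):
    # Pair up words positionally and report (index, original, replacement)
    # for every position where the two texts disagree.
    orig_words = original.split()
    wm_words = watermarked.split()
    return [(i, o, w) for i, (o, w) in enumerate(zip(orig_words, wm_words)) if o != w]

# Example usage on a short fragment
original = "quantum computers use quantum bits or qubits to perform computations"
watermarked = "quantum computers use quantum bits or qubits to perform calculations"
print(diff_words(original, watermarked))  # [(9, 'computations', 'calculations')]
```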

```
from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM
import torch

def watermark_text_and_calculate_matches(text, model_name="bert-base-uncased", max_offset=5):
    # Clean and split the input text
    text = " ".join(text.split())
    words = text.split()
    # Initialize the tokenizer and model; the pipeline's `device` argument
    # moves the model to the GPU if one is available
    device = 0 if torch.cuda.is_available() else -1
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)
    # Initialize the fill-mask pipeline
    classifier = pipeline("fill-mask", model=model, tokenizer=tokenizer, device=device)
    # Dictionary to store match ratios for each offset
    match_ratios = {}
    # Loop over each candidate offset
    for offset in range(max_offset):
        # Replace every fifth word with [MASK], starting from the offset
        modified_words = words.copy()
        for i in range(offset, len(modified_words)):
            if (i + 1 - offset) % 5 == 0:
                modified_words[i] = '[MASK]'
        # Make a copy of the modified words list to work on
        watermarked_words = modified_words.copy()
        total_replacements = 0
        total_matches = 0
        # Process the text in chunks, one [MASK] per chunk
        for i in range(offset, len(modified_words), 5):
            if i + 4 >= len(modified_words):
                break  # no mask left in the tail of the text
            chunk = " ".join(watermarked_words[:i + 9])
            if '[MASK]' in chunk:
                try:
                    tempd = classifier(chunk)
                except Exception as e:
                    print(f"Error processing chunk '{chunk}': {e}")
                    continue
                if tempd:
                    temps = tempd[0]['token_str']
                    original_word = words[i + 4]
                    replaced_word = temps.split()[0]
                    watermarked_words[i + 4] = replaced_word
                    # Count how often the model's top prediction matches the original word
                    total_replacements += 1
                    if replaced_word == original_word:
                        total_matches += 1
        # Calculate the match ratio for the current offset
        match_ratios[offset] = total_matches / total_replacements if total_replacements > 0 else 0
    # Return the match ratios for each offset
    return match_ratios

# Example usage
text = "Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems."
# Calculate match ratios
match_ratios = watermark_text_and_calculate_matches(text, max_offset=5)
print(match_ratios)
```

```
{0: 0.5384615384615384, 1: 0.6153846153846154, 2: 0.5833333333333334, 3: 0.6666666666666666, 4: 0.5833333333333334}
```
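Detection then reduces to picking the offset whose match ratio is highest. With a dictionary like the one printed above (values rounded here for brevity), that selection is a one-liner:

```python
# Match ratios per candidate offset, rounded from the run above
match_ratios = {0: 0.538, 1: 0.615, 2: 0.583, 3: 0.667, 4: 0.583}

# The detected watermark offset is the argmax over offsets
best_offset = max(match_ratios, key=match_ratios.get)
print(best_offset)  # 3
```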

```
from scipy.stats import ttest_1samp
import numpy as np

def check_significant_difference(match_ratios):
    # Extract ratios into a list and sort them
    ratios = sorted(match_ratios.values())
    # Separate the highest ratio from the rest; slicing drops exactly one
    # copy of the maximum, so a tie cannot remove more than one value
    highest_ratio = ratios[-1]
    other_ratios = ratios[:-1]
    average_other_ratios = np.mean(other_ratios)
    # One-sample t-test: do the other ratios differ from the highest one?
    t_stat, p_value = ttest_1samp(other_ratios, highest_ratio)
    # Print the results
    print(f"Highest Match Ratio: {highest_ratio}")
    print(f"Average of Other Ratios: {average_other_ratios}")
    print(f"T-Statistic: {t_stat}")
    print(f"P-Value: {p_value}")
    # Determine if the difference is statistically significant (at the 0.05 level)
    if p_value < 0.05:
        print("The highest ratio is significantly different from the others.")
    else:
        print("The highest ratio is not significantly different from the others.")
    return [highest_ratio, average_other_ratios, t_stat, p_value]

# Example usage
text = "Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems."
# match_ratios = watermark_text_and_calculate_matches(text, max_offset=5)
# check_significant_difference(match_ratios)
```

```
import random

def randomly_add_words(text, words_to_add, num_words_to_add):
    # Clean and split the input text
    text = " ".join(text.split())
    words = text.split()
    # Insert words randomly into the text
    for _ in range(num_words_to_add):
        # Choose a random position to insert the word
        position = random.randint(0, len(words))
        # Choose a random word to insert
        word_to_insert = random.choice(words_to_add)
        # Insert the word at the random position
        words.insert(position, word_to_insert)
    # Join the list back into a string and return the modified text
    return " ".join(words)

# Example usage
text = "Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems."
words_to_add = ["example", "test", "random", "insert"]
num_words_to_add = 5
# modified_text = randomly_add_words(text, words_to_add, num_words_to_add)
modified_text = randomly_add_words(watermark_text(text, offset=0), words_to_add, num_words_to_add)
print("Original Text:")
print(text)
print("\nModified Text:")
print(modified_text)
match_ratios = watermark_text_and_calculate_matches(modified_text, max_offset=5)
print(match_ratios)
check_significant_difference(match_ratios)
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Original Text:
Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems.
Watermark Areas:
Quantum computing is a [MASK] evolving field that leverages [MASK] principles of quantum mechanics [MASK] perform computations that are [MASK] for classical computers. Unlike [MASK] computers, which use bits [MASK] the fundamental unit of [MASK] quantum computers use quantum [MASK] or qubits. Qubits can [MASK] in multiple states simultaneously [MASK] to the principles of [MASK] and entanglement, providing a [MASK] advantage in solving complex [MASK] problems.
Watermarked Text:
Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are impossible for classical computers. Unlike quantum computers, which use bits as the fundamental unit of , quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously according to the principles of symmetry and entanglement, providing a significant advantage in solving complex mathematical problems.
Original Text:
Quantum computing is a rapidly evolving field that leverages the principles of quantum mechanics to perform computations that are infeasible for classical computers. Unlike classical computers, which use bits as the fundamental unit of information, quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously due to the principles of superposition and entanglement, providing a significant advantage in solving complex computational problems.
Modified Text:
Quantum computing is example a rapidly evolving field that leverages the principles of quantum mechanics to perform random computations that are impossible for classical computers. Unlike quantum computers, which use bits as the random insert fundamental unit of , quantum computers use quantum bits or qubits. Qubits can exist in multiple states simultaneously according random to the principles of symmetry and entanglement, providing a significant advantage in solving complex mathematical problems.
```

```
{0: 0.5714285714285714, 1: 0.5714285714285714, 2: 0.5384615384615384, 3: 0.38461538461538464, 4: 0.7692307692307693}
Highest Match Ratio: 0.7692307692307693
Average of Other Ratios: 0.5164835164835164
T-Statistic: -5.66220858504931
P-Value: 0.010908789440745323
The highest ratio is significantly different from the others.
```
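For intuition, the t-statistic printed above can be reproduced without scipy. The ratios appear to correspond to the fractions 8/14, 8/14, 7/13, 5/13 for the four lower offsets and 10/13 for the best one (these fractions are inferred from the printed decimals, so treat them as an assumption); a one-sample t-test of the lower ratios against the highest one is then:

```python
import math

# Match ratios for the four lower offsets and the best offset,
# written as the fractions inferred from the decimals printed above
others = [8/14, 8/14, 7/13, 5/13]
highest = 10/13

# One-sample t-test of `others` against the fixed value `highest`
n = len(others)
mean = sum(others) / n
var = sum((r - mean) ** 2 for r in others) / (n - 1)  # sample variance
t_stat = (mean - highest) / math.sqrt(var / n)
print(round(t_stat, 5))  # -5.66221
```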

```
[0.7692307692307693,
0.5164835164835164,
-5.66220858504931,
0.010908789440745323]
```
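The robustness test above only inserts words; an adversary could just as well delete them. A complementary sketch (the helper `randomly_delete_words` is hypothetical, not part of the notebook) whose output could be fed into `watermark_text_and_calculate_matches` in exactly the same way:

```python
import random

def randomly_delete_words(text, num_words_to_delete, seed=None):
    # Complement of randomly_add_words: drop words at random positions
    # to simulate an adversary editing the watermarked text.
    rng = random.Random(seed)
    words = " ".join(text.split()).split()
    # Never delete the last remaining word
    for _ in range(min(num_words_to_delete, len(words) - 1)):
        words.pop(rng.randrange(len(words)))
    return " ".join(words)

# Example usage
sample = "one two three four five six seven eight"
shorter = randomly_delete_words(sample, 3, seed=0)
print(len(shorter.split()))  # 5
```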

```
texts = [
"Artificial intelligence (AI) has seen remarkable advancements in recent years, transforming numerous industries. From healthcare to finance, AI technologies are being leveraged to improve efficiency and decision-making. In healthcare, AI algorithms are being used to analyze medical images, predict patient outcomes, and assist in surgery. Finance professionals are using AI for fraud detection, risk management, and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI systems are transparent and fair is critical for their continued adoption and trust. As AI continues to evolve, it is essential to consider both its potential benefits and challenges.",
"Climate change is one of the most pressing issues facing our planet today. Rising global temperatures, melting ice caps, and increasing frequency of extreme weather events are all indicators of this phenomenon. Scientists warn that without significant action to reduce greenhouse gas emissions, the effects of climate change will become more severe. Renewable energy sources such as solar, wind, and hydro power are being promoted as sustainable alternatives to fossil fuels. Additionally, individuals can make a difference by reducing their carbon footprint through actions like using public transportation, conserving energy, and supporting policies aimed at environmental protection.",
"The field of biotechnology is revolutionizing medicine and agriculture. Advances in genetic engineering have enabled scientists to develop crops that are resistant to pests and diseases, as well as produce higher yields. In medicine, biotechnology is being used to create personalized treatments based on an individual's genetic makeup. This approach, known as precision medicine, aims to provide more effective and targeted therapies for various diseases. However, the rapid pace of biotechnological innovation also raises ethical and regulatory questions. It is crucial to balance the benefits of these technologies with the potential risks and ensure that they are used responsibly.",
"Quantum computing is poised to revolutionize the world of computing. Unlike classical computers, which use bits to represent data as 0s and 1s, quantum computers use qubits, which can exist in multiple states simultaneously. This allows quantum computers to perform complex calculations much faster than their classical counterparts. Potential applications of quantum computing include cryptography, drug discovery, and optimization problems. However, building a practical and scalable quantum computer remains a significant challenge. Researchers are exploring various approaches, such as superconducting qubits and trapped ions, to overcome these hurdles and bring quantum computing closer to reality.",
"The internet of things (IoT) is transforming the way we interact with the world around us. IoT refers to the network of interconnected devices that collect and exchange data. These devices range from smart home appliances to industrial sensors, and their applications are vast. In the home, IoT devices can automate tasks like adjusting the thermostat, turning off lights, and monitoring security systems. In industry, IoT is used to optimize supply chains, monitor equipment health, and improve safety. However, the proliferation of IoT devices also raises concerns about security and privacy. Ensuring that these devices are secure and that data is protected is essential for the continued growth of IoT.",
"Renewable energy is gaining momentum as a viable solution to the world's energy needs. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate as countries seek to reduce their carbon emissions and transition to cleaner energy sources. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.",
"The rise of e-commerce has transformed the retail industry. Online shopping has become increasingly popular, offering consumers convenience and a wide range of products at their fingertips. Major e-commerce platforms like Amazon, Alibaba, and eBay have disrupted traditional brick-and-mortar stores, leading to significant changes in consumer behavior. The COVID-19 pandemic further accelerated the shift to online shopping, as lockdowns and social distancing measures limited in-person shopping. While e-commerce offers many benefits, it also presents challenges, such as the need for efficient logistics and concerns about data privacy. As the industry continues to evolve, companies are exploring new technologies like augmented reality and artificial intelligence to enhance the online shopping experience.",
"Cybersecurity is a critical concern in today's digital age. With the increasing reliance on technology and the internet, the risk of cyberattacks has grown significantly. Cybercriminals use various methods, such as phishing, ransomware, and malware, to exploit vulnerabilities in systems and steal sensitive information. Organizations must implement robust cybersecurity measures to protect their data and infrastructure. This includes using encryption, multi-factor authentication, and regular security audits. Additionally, individuals can take steps to safeguard their personal information, such as using strong passwords and being cautious of suspicious emails. As cyber threats continue to evolve, staying informed and vigilant is essential for maintaining cybersecurity.",
"The field of robotics is advancing rapidly, with applications ranging from manufacturing to healthcare. Industrial robots are used to automate repetitive tasks, improve precision, and increase efficiency in manufacturing processes. In healthcare, robots assist in surgeries, rehabilitation, and patient care. Social robots are being developed to provide companionship and support for the elderly and individuals with disabilities. The integration of artificial intelligence and machine learning has further enhanced the capabilities of robots, enabling them to perform complex tasks and adapt to new situations. However, the rise of robotics also raises ethical and societal questions, such as the impact on employment and the need for responsible development and use of these technologies.",
"Space exploration has captured the imagination of humanity for centuries. Recent advancements in technology have made space missions more feasible and ambitious. Private companies like SpaceX and Blue Origin are playing a significant role in this new era of space exploration. SpaceX's successful launches and plans for Mars colonization have reignited interest in space travel. NASA and other space agencies are also focusing on missions to the Moon, Mars, and beyond. The development of new propulsion systems, space habitats, and life support technologies are critical for the success of these missions. While space exploration holds great promise, it also presents challenges, including the need for international cooperation, funding, and addressing the environmental impact of space activities.",
"Climate change is driving the need for sustainable agriculture practices. Traditional farming methods often rely on chemical fertilizers and pesticides, which can harm the environment and human health. Sustainable agriculture aims to reduce the negative impact of farming by promoting practices that conserve resources, protect biodiversity, and improve soil health. Techniques such as crop rotation, cover cropping, and organic farming are being adopted by farmers worldwide. Additionally, advances in agricultural technology, such as precision farming and vertical farming, are helping to increase efficiency and reduce waste. By embracing sustainable agriculture, we can ensure food security for future generations while protecting the planet.",
"The rise of electric vehicles (EVs) is transforming the automotive industry. EVs offer a cleaner and more sustainable alternative to traditional gasoline-powered vehicles, with lower emissions and reduced dependence on fossil fuels. Major automakers are investing heavily in EV technology, and the market for electric cars is growing rapidly. Advances in battery technology are improving the range and performance of EVs, making them more practical for everyday use. Governments around the world are also supporting the transition to electric vehicles through incentives, subsidies, and the development of charging infrastructure. While challenges remain, such as the need for widespread charging stations and the environmental impact of battery production, the future of transportation is increasingly electric.",
"Artificial intelligence (AI) is transforming the field of education. AI-powered tools and platforms are being used to personalize learning, automate administrative tasks, and provide real-time feedback to students. Personalized learning systems use AI algorithms to analyze student performance and tailor instruction to individual needs. This approach can help improve student outcomes by addressing learning gaps and providing targeted support. AI is also being used to create adaptive assessments, intelligent tutoring systems, and virtual learning environments. While AI in education offers many benefits, it also raises questions about data privacy, the role of teachers, and the need for equitable access to technology. As AI continues to evolve, it has the potential to revolutionize the way we teach and learn.",
"The field of renewable energy is experiencing significant growth as countries seek to reduce their carbon emissions and transition to cleaner energy sources. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate, driven by advancements in technology, falling costs, and supportive government policies. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.",
"The COVID-19 pandemic has had a profound impact on the world, affecting nearly every aspect of daily life. The pandemic has led to widespread illness, loss of life, and economic disruption. Healthcare systems have been stretched to their limits, and the need for effective treatments and vaccines has become paramount. Scientists and researchers have worked tirelessly to develop vaccines and treatments for COVID-19, leading to the rapid development and distribution of several effective vaccines. The pandemic has also highlighted the importance of public health measures, such as social distancing, mask-wearing, and hand hygiene. As the world continues to grapple with the pandemic, efforts to prevent future outbreaks and improve global health infrastructure are essential.",
"The concept of smart cities is gaining traction as urban areas look for ways to improve efficiency, sustainability, and quality of life for residents. Smart cities leverage technology and data to optimize city services, such as transportation, energy, and waste management. For example, smart traffic management systems can reduce congestion and improve air quality by adjusting traffic signals in real-time based on traffic flow. Smart grids can enhance energy efficiency by balancing supply and demand and integrating renewable energy sources. Additionally, smart waste management systems use sensors to monitor waste levels and optimize collection routes. While smart cities offer many benefits, they also raise concerns about data privacy, cybersecurity, and the need for equitable access to technology.",
"The field of biotechnology is revolutionizing medicine and agriculture. Advances in genetic engineering have enabled scientists to develop crops that are resistant to pests and diseases, as well as produce higher yields. In medicine, biotechnology is being used to create personalized treatments based on an individual's genetic makeup. This approach, known as precision medicine, aims to provide more effective and targeted therapies for various diseases. However, the rapid pace of biotechnological innovation also raises ethical and regulatory questions. It is crucial to balance the benefits of these technologies with the potential risks and ensure that they are used responsibly.",
"The rise of renewable energy is transforming the global energy landscape. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate as countries seek to reduce their carbon emissions and transition to cleaner energy sources. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.",
"The field of cybersecurity is becoming increasingly important as our reliance on technology and the internet grows. Cyberattacks can have devastating consequences, including the theft of sensitive information, financial loss, and damage to an organization's reputation. Cybercriminals use various methods, such as phishing, ransomware, and malware, to exploit vulnerabilities in systems. Organizations must implement robust cybersecurity measures to protect their data and infrastructure. This includes using encryption, multi-factor authentication, and regular security audits. Additionally, individuals can take steps to safeguard their personal information, such as using strong passwords and being cautious of suspicious emails. As cyber threats continue to evolve, staying informed and vigilant is essential for maintaining cybersecurity.",
"The rise of e-commerce has transformed the retail industry. Online shopping has become increasingly popular, offering consumers convenience and a wide range of products at their fingertips. Major e-commerce platforms like Amazon, Alibaba, and eBay have disrupted traditional brick-and-mortar stores, leading to significant changes in consumer behavior. The COVID-19 pandemic further accelerated the shift to online shopping, as lockdowns and social distancing measures limited in-person shopping. While e-commerce offers many benefits, it also presents challenges, such as the need for efficient logistics and concerns about data privacy. As the industry continues to evolve, companies are exploring new technologies like augmented reality and artificial intelligence to enhance the online shopping experience.",
"Artificial intelligence (AI) is transforming the field of healthcare. AI-powered tools and platforms are being used to analyze medical images, predict patient outcomes, and assist in surgery. In radiology, AI algorithms can help detect abnormalities in medical images, such as tumors or fractures, with high accuracy. In predictive analytics, AI can analyze patient data to identify individuals at risk of developing certain conditions, allowing for early intervention and personalized treatment plans. AI is also being used in robotic surgery, where it can enhance precision and reduce the risk of complications. While AI in healthcare offers many benefits, it also raises questions about data privacy, the role of healthcare professionals, and the need for regulatory oversight.",
"The field of renewable energy is experiencing significant growth as countries seek to reduce their carbon emissions and transition to cleaner energy sources. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate, driven by advancements in technology, falling costs, and supportive government policies. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.",
"The COVID-19 pandemic has had a profound impact on the world, affecting nearly every aspect of daily life. The pandemic has led to widespread illness, loss of life, and economic disruption. Healthcare systems have been stretched to their limits, and the need for effective treatments and vaccines has become paramount. Scientists and researchers have worked tirelessly to develop vaccines and treatments for COVID-19, leading to the rapid development and distribution of several effective vaccines. The pandemic has also highlighted the importance of public health measures, such as social distancing, mask-wearing, and hand hygiene. As the world continues to grapple with the pandemic, efforts to prevent future outbreaks and improve global health infrastructure are essential.",
"The concept of smart cities is gaining traction as urban areas look for ways to improve efficiency, sustainability, and quality of life for residents. Smart cities leverage technology and data to optimize city services, such as transportation, energy, and waste management. For example, smart traffic management systems can reduce congestion and improve air quality by adjusting traffic signals in real-time based on traffic flow. Smart grids can enhance energy efficiency by balancing supply and demand and integrating renewable energy sources. Additionally, smart waste management systems use sensors to monitor waste levels and optimize collection routes. While smart cities offer many benefits, they also raise concerns about data privacy, cybersecurity, and the need for equitable access to technology.",
"The field of biotechnology is revolutionizing medicine and agriculture. Advances in genetic engineering have enabled scientists to develop crops that are resistant to pests and diseases, as well as produce higher yields. In medicine, biotechnology is being used to create personalized treatments based on an individual's genetic makeup. This approach, known as precision medicine, aims to provide more effective and targeted therapies for various diseases. However, the rapid pace of biotechnological innovation also raises ethical and regulatory questions. It is crucial to balance the benefits of these technologies with the potential risks and ensure that they are used responsibly.",
"The rise of renewable energy is transforming the global energy landscape. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate as countries seek to reduce their carbon emissions and transition to cleaner energy sources. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.",
"The field of cybersecurity is becoming increasingly important as our reliance on technology and the internet grows. Cyberattacks can have devastating consequences, including the theft of sensitive information, financial loss, and damage to an organization's reputation. Cybercriminals use various methods, such as phishing, ransomware, and malware, to exploit vulnerabilities in systems. Organizations must implement robust cybersecurity measures to protect their data and infrastructure. This includes using encryption, multi-factor authentication, and regular security audits. Additionally, individuals can take steps to safeguard their personal information, such as using strong passwords and being cautious of suspicious emails. As cyber threats continue to evolve, staying informed and vigilant is essential for maintaining cybersecurity.",
"The rise of e-commerce has transformed the retail industry. Online shopping has become increasingly popular, offering consumers convenience and a wide range of products at their fingertips. Major e-commerce platforms like Amazon, Alibaba, and eBay have disrupted traditional brick-and-mortar stores, leading to significant changes in consumer behavior. The COVID-19 pandemic further accelerated the shift to online shopping, as lockdowns and social distancing measures limited in-person shopping. While e-commerce offers many benefits, it also presents challenges, such as the need for efficient logistics and concerns about data privacy. As the industry continues to evolve, companies are exploring new technologies like augmented reality and artificial intelligence to enhance the online shopping experience.",
"Artificial intelligence (AI) is transforming the field of healthcare. AI-powered tools and platforms are being used to analyze medical images, predict patient outcomes, and assist in surgery. In radiology, AI algorithms can help detect abnormalities in medical images, such as tumors or fractures, with high accuracy. In predictive analytics, AI can analyze patient data to identify individuals at risk of developing certain conditions, allowing for early intervention and personalized treatment plans. AI is also being used in robotic surgery, where it can enhance precision and reduce the risk of complications. While AI in healthcare offers many benefits, it also raises questions about data privacy, the role of healthcare professionals, and the need for regulatory oversight.",
]
```

```
# Watermark the first text, then tamper with it by inserting five random
# words, and check whether the watermark signal survives.
text = texts[0]
words_to_add = ["example", "test", "random", "insert"]
num_words_to_add = 5
modified_text = randomly_add_words(watermark_text(text, offset=0), words_to_add, num_words_to_add)
print("Original Text:")
print(text)
print("\nModified Text:")
print(modified_text)
match_ratios = watermark_text_and_calculate_matches(modified_text, max_offset=5)
print(match_ratios)
check_significant_difference(match_ratios)
```
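For reference, the tamper-simulation helper used above can be sketched as follows. This is an assumed minimal version of `randomly_add_words`, not necessarily the notebook's exact implementation (the `seed` parameter is added here for reproducibility):

```python
import random

def randomly_add_words(text, words_to_add, num_words_to_add, seed=None):
    """Insert `num_words_to_add` words, drawn from `words_to_add`, at
    random positions to simulate tampering with watermarked text."""
    rng = random.Random(seed)
    tokens = text.split()
    for _ in range(num_words_to_add):
        position = rng.randint(0, len(tokens))
        tokens.insert(position, rng.choice(words_to_add))
    return " ".join(tokens)
```

Because the inserted words land at random positions, the tampered text shifts word offsets, which is exactly the perturbation the detection step below has to survive.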

```
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
```

```
Done 1 th word
Done 6 th word
...
Done 96 th word
Original Text:
Artificial intelligence (AI) has seen remarkable advancements in recent years, transforming numerous industries. From healthcare to finance, AI technologies are being leveraged to improve efficiency and decision-making. In healthcare, AI algorithms are being used to analyze medical images, predict patient outcomes, and assist in surgery. Finance professionals are using AI for fraud detection, risk management, and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI systems are transparent and fair is critical for their continued adoption and trust. As AI continues to evolve, it is essential to consider both its potential benefits and challenges.
Watermark Areas:
Artificial intelligence (AI) has [MASK] remarkable advancements in recent [MASK] transforming numerous industries. From [MASK] to finance, AI technologies [MASK] being leveraged to improve [MASK] and decision-making. In healthcare, [MASK] algorithms are being used [MASK] analyze medical images, predict [MASK] outcomes, and assist in [MASK] Finance professionals are using [MASK] for fraud detection, risk [MASK] and algorithmic trading. Despite [MASK] advancements, AI also raises [MASK] concerns, particularly regarding bias [MASK] privacy. Ensuring that AI [MASK] are transparent and fair [MASK] critical for their continued [MASK] and trust. As AI [MASK] to evolve, it is [MASK] to consider both its [MASK] benefits and challenges.
Watermarked Text:
Artificial intelligence (AI) has made remarkable advancements in recent years transforming numerous industries. From manufacturing to finance, AI technologies are being leveraged to improve performance and decision-making. In healthcare, ai algorithms are being used to analyze medical images, predict patient outcomes, and assist in how Finance professionals are using them for fraud detection, risk management and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI algorithms are transparent and fair is critical for their continued integrity and trust. As AI continues to evolve, it is important to consider both its potential benefits and challenges.
Original Text:
Artificial intelligence (AI) has seen remarkable advancements in recent years, transforming numerous industries. From healthcare to finance, AI technologies are being leveraged to improve efficiency and decision-making. In healthcare, AI algorithms are being used to analyze medical images, predict patient outcomes, and assist in surgery. Finance professionals are using AI for fraud detection, risk management, and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI systems are transparent and fair is critical for their continued adoption and trust. As AI continues to evolve, it is essential to consider both its potential benefits and challenges.
Modified Text:
Artificial intelligence (AI) has made remarkable advancements in recent years transforming numerous industries. From manufacturing to finance, AI technologies are being leveraged to improve performance and decision-making. In healthcare, ai algorithms are being used to analyze medical images, predict patient outcomes, random and assist in how Finance professionals are using them for fraud example detection, risk management and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that example AI algorithms are transparent and fair test is critical for their continued integrity and trust. As AI continues to evolve, it is random important to consider both its potential benefits and challenges.
```

```
{0: 0.6190476190476191, 1: 0.3333333333333333, 2: 0.42857142857142855, 3: 0.2857142857142857, 4: 0.55}
Highest Match Ratio: 0.6190476190476191
Average of Other Ratios: 0.3994047619047619
T-Statistic: -3.765894344306259
P-Value: 0.032757613277666235
The highest ratio is significantly different from the others.
```

```
[0.6190476190476191,
0.3994047619047619,
-3.765894344306259,
0.032757613277666235]
```
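The figures above can be reproduced with a one-sample t-test: the highest match ratio (the candidate true offset) is treated as the population mean and tested against the remaining ratios. A minimal sketch assuming `scipy` is available; the notebook's actual `check_significant_difference` may differ in detail, so the function is named separately here:

```python
from scipy import stats

def check_significance_sketch(match_ratios, alpha=0.05):
    """Compare the best offset's match ratio against the other offsets'
    ratios with a one-sample t-test (scipy.stats.ttest_1samp)."""
    best_offset = max(match_ratios, key=match_ratios.get)
    highest = match_ratios[best_offset]
    others = [r for k, r in match_ratios.items() if k != best_offset]
    avg_others = sum(others) / len(others)
    t_stat, p_value = stats.ttest_1samp(others, highest)
    return [highest, avg_others, t_stat, p_value, bool(p_value < alpha)]

ratios = {0: 0.6190476190476191, 1: 0.3333333333333333,
          2: 0.42857142857142855, 3: 0.2857142857142857, 4: 0.55}
print(check_significance_sketch(ratios))
```

Run on the ratios printed above, this yields the same highest ratio (0.619), average of the others (0.399), t-statistic (-3.766), and p-value (0.033), so offset 0 is flagged as significantly different, i.e. the watermark is detected.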

```
list_of_significance = []
list_of_significance_watermarked = []
count_t = 0
for text in texts:
    count_t += 1
    print("___________________________________________________________________________________________________________________________")
    print("Doing", count_t)
    print("___________________________________________________________________________________________________________________________")
    # Watermark each text, insert random words to simulate tampering,
    # then run the detection test on both the tampered and the clean text.
    words_to_add = ["example", "test", "random", "insert"]
    num_words_to_add = 5
    modified_text = randomly_add_words(watermark_text(text, offset=0), words_to_add, num_words_to_add)
    match_ratios = watermark_text_and_calculate_matches(modified_text, max_offset=5)
    list_of_significance_watermarked.append(check_significant_difference(match_ratios))
    match_ratios = watermark_text_and_calculate_matches(text, max_offset=5)
    list_of_significance.append(check_significant_difference(match_ratios))
    print("___________________________________________________________________________________________________________________________")
    print("Done", count_t)
    print("___________________________________________________________________________________________________________________________")
```
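Once the loop finishes, the two lists can be condensed into a single detection rate: how often the best offset was flagged as significant for watermarked-and-tampered texts versus clean texts. Assuming each entry has the `[highest_ratio, avg_other_ratios, t_statistic, p_value]` shape shown earlier, a hypothetical summary helper:

```python
def detection_rate(results, alpha=0.05):
    """Fraction of texts whose best-offset match ratio was flagged as
    significant, i.e. whose p-value (index 3) falls below alpha."""
    flags = [r[3] < alpha for r in results]
    return sum(flags) / len(flags)

# Usage (after the loop above has populated both lists):
# detection_rate(list_of_significance_watermarked)  # true-positive rate
# detection_rate(list_of_significance)              # false-positive rate
```

A large gap between the two rates is what would indicate the watermark is both detectable and specific.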

```
___________________________________________________________________________________________________________________________
Doing 1
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
...
Done 96 th word
Original Text:
Artificial intelligence (AI) has seen remarkable advancements in recent years, transforming numerous industries. From healthcare to finance, AI technologies are being leveraged to improve efficiency and decision-making. In healthcare, AI algorithms are being used to analyze medical images, predict patient outcomes, and assist in surgery. Finance professionals are using AI for fraud detection, risk management, and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI systems are transparent and fair is critical for their continued adoption and trust. As AI continues to evolve, it is essential to consider both its potential benefits and challenges.
Watermark Areas:
Artificial intelligence (AI) has [MASK] remarkable advancements in recent [MASK] transforming numerous industries. From [MASK] to finance, AI technologies [MASK] being leveraged to improve [MASK] and decision-making. In healthcare, [MASK] algorithms are being used [MASK] analyze medical images, predict [MASK] outcomes, and assist in [MASK] Finance professionals are using [MASK] for fraud detection, risk [MASK] and algorithmic trading. Despite [MASK] advancements, AI also raises [MASK] concerns, particularly regarding bias [MASK] privacy. Ensuring that AI [MASK] are transparent and fair [MASK] critical for their continued [MASK] and trust. As AI [MASK] to evolve, it is [MASK] to consider both its [MASK] benefits and challenges.
Watermarked Text:
Artificial intelligence (AI) has made remarkable advancements in recent years transforming numerous industries. From manufacturing to finance, AI technologies are being leveraged to improve performance and decision-making. In healthcare, ai algorithms are being used to analyze medical images, predict patient outcomes, and assist in how Finance professionals are using them for fraud detection, risk management and algorithmic trading. Despite these advancements, AI also raises ethical concerns, particularly regarding bias and privacy. Ensuring that AI algorithms are transparent and fair is critical for their continued integrity and trust. As AI continues to evolve, it is important to consider both its potential benefits and challenges.
```

```
Highest Match Ratio: 0.5238095238095238
Average of Other Ratios: 0.4339285714285714
T-Statistic: -16.777777777777764
P-Value: 0.0004610425326555902
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.5789473684210527
Average of Other Ratios: 0.4375
T-Statistic: -11.315789473684214
P-Value: 0.0014802633181549077
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 1
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 2
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
...
Done 91 th word
Original Text:
Climate change is one of the most pressing issues facing our planet today. Rising global temperatures, melting ice caps, and increasing frequency of extreme weather events are all indicators of this phenomenon. Scientists warn that without significant action to reduce greenhouse gas emissions, the effects of climate change will become more severe. Renewable energy sources such as solar, wind, and hydro power are being promoted as sustainable alternatives to fossil fuels. Additionally, individuals can make a difference by reducing their carbon footprint through actions like using public transportation, conserving energy, and supporting policies aimed at environmental protection.
Watermark Areas:
Climate change is one [MASK] the most pressing issues [MASK] our planet today. Rising [MASK] temperatures, melting ice caps, [MASK] increasing frequency of extreme [MASK] events are all indicators [MASK] this phenomenon. Scientists warn [MASK] without significant action to [MASK] greenhouse gas emissions, the [MASK] of climate change will [MASK] more severe. Renewable energy [MASK] such as solar, wind, [MASK] hydro power are being [MASK] as sustainable alternatives to [MASK] fuels. Additionally, individuals can [MASK] a difference by reducing [MASK] carbon footprint through actions [MASK] using public transportation, conserving [MASK] and supporting policies aimed [MASK] environmental protection.
Watermarked Text:
Climate change is one of the most pressing issues facing our planet today. Rising global temperatures, melting ice caps, and increasing frequency of extreme weather events are all indicators of this phenomenon. Scientists warn people without significant action to reduce greenhouse gas emissions, the consequences of climate change will be more severe. Renewable energy sources such as solar, wind, and hydro power are being considered as sustainable alternatives to fossil fuels. Additionally, individuals can make a difference by reducing their carbon footprint through actions like using public transportation, conserving water and supporting policies aimed at environmental protection.
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.4875
T-Statistic: -6.825454528992923
P-Value: 0.006434193533715181
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.7368421052631579
Average of Other Ratios: 0.4992690058479532
T-Statistic: -4.1686084498315
P-Value: 0.02512588315462554
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 2
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 3
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Done 66 th word
Done 71 th word
Done 76 th word
Done 81 th word
Done 86 th word
Done 91 th word
Original Text:
The field of biotechnology is revolutionizing medicine and agriculture. Advances in genetic engineering have enabled scientists to develop crops that are resistant to pests and diseases, as well as produce higher yields. In medicine, biotechnology is being used to create personalized treatments based on an individual's genetic makeup. This approach, known as precision medicine, aims to provide more effective and targeted therapies for various diseases. However, the rapid pace of biotechnological innovation also raises ethical and regulatory questions. It is crucial to balance the benefits of these technologies with the potential risks and ensure that they are used responsibly.
Watermark Areas:
The field of biotechnology [MASK] revolutionizing medicine and agriculture. [MASK] in genetic engineering have [MASK] scientists to develop crops [MASK] are resistant to pests [MASK] diseases, as well as [MASK] higher yields. In medicine, [MASK] is being used to [MASK] personalized treatments based on [MASK] individual's genetic makeup. This [MASK] known as precision medicine, [MASK] to provide more effective [MASK] targeted therapies for various [MASK] However, the rapid pace [MASK] biotechnological innovation also raises [MASK] and regulatory questions. It [MASK] crucial to balance the [MASK] of these technologies with [MASK] potential risks and ensure [MASK] they are used responsibly.
Watermarked Text:
The field of biotechnology is revolutionizing medicine and agriculture. advances in genetic engineering have allowed scientists to develop crops that are resistant to pests and diseases, as well as achieve higher yields. In medicine, biotechnology is being used to develop personalized treatments based on an individual's genetic makeup. This is known as precision medicine, designed to provide more effective and targeted therapies for various diseases However, the rapid pace of biotechnological innovation also raises ethical and regulatory questions. It is crucial to balance the benefits of these technologies with the potential risks and ensure that they are used responsibly.
```
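The "Watermark Areas" string above is produced by blanking out every fifth word with BERT's `[MASK]` token before asking the model to refill the slots. A minimal sketch of that masking step (the function name is ours, not from the notebook):

```python
def mask_every_nth(text, n=5):
    """Replace every n-th word of `text` with BERT's [MASK] token."""
    words = text.split()
    for i in range(n - 1, len(words), n):  # positions 4, 9, 14, ... for n=5
        words[i] = "[MASK]"
    return " ".join(words)

print(mask_every_nth("The field of biotechnology is revolutionizing medicine and agriculture"))
# → "The field of biotechnology [MASK] revolutionizing medicine and agriculture"
```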

```
C:\Users\rrath\.conda\envs\py310\lib\site-packages\scipy\stats\_axis_nan_policy.py:523: RuntimeWarning: Precision loss occurred in moment calculation due to catastrophic cancellation. This occurs when the data are nearly identical. Results may be unreliable.
res = hypotest_fun_out(*samples, **kwds)
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.5
T-Statistic: -inf
P-Value: 0.0
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.631578947368421
Average of Other Ratios: 0.47368421052631576
T-Statistic: -2.5980762113533156
P-Value: 0.12168993434632014
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 3
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 4
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Done 66 th word
Done 71 th word
Done 76 th word
Done 81 th word
Done 86 th word
Done 91 th word
Original Text:
Quantum computing is poised to revolutionize the world of computing. Unlike classical computers, which use bits to represent data as 0s and 1s, quantum computers use qubits, which can exist in multiple states simultaneously. This allows quantum computers to perform complex calculations much faster than their classical counterparts. Potential applications of quantum computing include cryptography, drug discovery, and optimization problems. However, building a practical and scalable quantum computer remains a significant challenge. Researchers are exploring various approaches, such as superconducting qubits and trapped ions, to overcome these hurdles and bring quantum computing closer to reality.
Watermark Areas:
Quantum computing is poised [MASK] revolutionize the world of [MASK] Unlike classical computers, which [MASK] bits to represent data [MASK] 0s and 1s, quantum [MASK] use qubits, which can [MASK] in multiple states simultaneously. [MASK] allows quantum computers to [MASK] complex calculations much faster [MASK] their classical counterparts. Potential [MASK] of quantum computing include [MASK] drug discovery, and optimization [MASK] However, building a practical [MASK] scalable quantum computer remains [MASK] significant challenge. Researchers are [MASK] various approaches, such as [MASK] qubits and trapped ions, [MASK] overcome these hurdles and [MASK] quantum computing closer to [MASK]
Watermarked Text:
Quantum computing is poised to revolutionize the world of computing Unlike classical computers, which use bits to represent data between 0s and 1s, quantum computers use qubits, which can exist in multiple states simultaneously. this allows quantum computers to perform complex calculations much faster than their classical counterparts. Potential applications of quantum computing include : drug discovery, and optimization . However, building a practical and scalable quantum computer remains a significant challenge. Researchers are exploring various approaches, such as trapped qubits and trapped ions, to overcome these hurdles and bring quantum computing closer to .
```
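Each `[MASK]` slot is then refilled with the masked-language-model's top prediction, which is how substitutions like "applications" → ":" above arise. A minimal sketch with the Hugging Face fill-mask pipeline, using the same `bert-base-uncased` checkpoint used throughout (this is an illustrative reconstruction, not the exact notebook code):

```python
from transformers import pipeline

# fill-mask returns candidate tokens sorted by score; the watermarking loop
# appears to take the top-scoring candidate for each masked slot in turn.
fill = pipeline("fill-mask", model="bert-base-uncased")
candidates = fill("Quantum computing is poised [MASK] revolutionize the world of computing.")
print(candidates[0]["token_str"])
```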

```
Highest Match Ratio: 0.5789473684210527
Average of Other Ratios: 0.46578947368421053
T-Statistic: -14.333333333333357
P-Value: 0.004832243042167172
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.5051169590643274
T-Statistic: -3.25528426992502
P-Value: 0.047299956469803
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 4
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 5
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Done 66 th word
Done 71 th word
Done 76 th word
Done 81 th word
Done 86 th word
Done 91 th word
Done 96 th word
Done 101 th word
Done 106 th word
Original Text:
The internet of things (IoT) is transforming the way we interact with the world around us. IoT refers to the network of interconnected devices that collect and exchange data. These devices range from smart home appliances to industrial sensors, and their applications are vast. In the home, IoT devices can automate tasks like adjusting the thermostat, turning off lights, and monitoring security systems. In industry, IoT is used to optimize supply chains, monitor equipment health, and improve safety. However, the proliferation of IoT devices also raises concerns about security and privacy. Ensuring that these devices are secure and that data is protected is essential for the continued growth of IoT.
Watermark Areas:
The internet of things [MASK] is transforming the way [MASK] interact with the world [MASK] us. IoT refers to [MASK] network of interconnected devices [MASK] collect and exchange data. [MASK] devices range from smart [MASK] appliances to industrial sensors, [MASK] their applications are vast. [MASK] the home, IoT devices [MASK] automate tasks like adjusting [MASK] thermostat, turning off lights, [MASK] monitoring security systems. In [MASK] IoT is used to [MASK] supply chains, monitor equipment [MASK] and improve safety. However, [MASK] proliferation of IoT devices [MASK] raises concerns about security [MASK] privacy. Ensuring that these [MASK] are secure and that [MASK] is protected is essential [MASK] the continued growth of [MASK]
Watermarked Text:
The internet of things , is transforming the way we interact with the world around us. IoT refers to a network of interconnected devices that collect and exchange data. these devices range from smart home appliances to industrial sensors, and their applications are vast. in the home, IoT devices can automate tasks like adjusting the thermostat, turning off lights, and monitoring security systems. In business IoT is used to monitor supply chains, monitor equipment , and improve safety. However, the proliferation of IoT devices also raises concerns about security and privacy. Ensuring that these devices are secure and that everyone is protected is essential to the continued growth of .
```

```
Highest Match Ratio: 0.6956521739130435
Average of Other Ratios: 0.4204545454545455
T-Statistic: -4.604144446201748
P-Value: 0.019265544435432887
The highest ratio is significantly different from the others.
```
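Detection reverses the embedding step: the suspect text is re-masked at the same positions, BERT re-predicts each slot, and the "Match Ratio" reported above is the fraction of slots where the re-prediction agrees with the word actually present. A minimal sketch of that ratio (the word lists here are hypothetical):

```python
def match_ratio(predicted_words, observed_words):
    """Fraction of masked positions where the model's re-prediction
    equals the word actually found at that slot in the suspect text."""
    assert len(predicted_words) == len(observed_words)
    hits = sum(p == o for p, o in zip(predicted_words, observed_words))
    return hits / len(observed_words)

predicted = ["is", "advances", "enabled", "that", "and"]  # hypothetical re-predictions
observed  = ["is", "advances", "allowed", "that", "or"]   # words at the masked slots
print(match_ratio(predicted, observed))  # → 0.6
```

A watermarked text should score a higher match ratio than unmarked candidates, since its slot words were themselves produced by the same model.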

```
Highest Match Ratio: 0.5454545454545454
Average of Other Ratios: 0.44047619047619047
T-Statistic: -1.9884454013529376
P-Value: 0.1408964116764383
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 5
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 6
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Done 66 th word
Done 71 th word
Done 76 th word
Done 81 th word
Done 86 th word
Done 91 th word
Done 96 th word
Done 101 th word
Done 106 th word
Original Text:
Renewable energy is gaining momentum as a viable solution to the world's energy needs. Solar, wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power harnesses energy from the sun using photovoltaic cells, while wind power generates electricity through turbines. Hydropower uses the energy of flowing water to produce electricity. These technologies are being adopted at an increasing rate as countries seek to reduce their carbon emissions and transition to cleaner energy sources. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.
Watermark Areas:
Renewable energy is gaining [MASK] as a viable solution [MASK] the world's energy needs. [MASK] wind, and hydro power [MASK] among the most common [MASK] of renewable energy, and [MASK] offer a sustainable alternative [MASK] fossil fuels. Solar power [MASK] energy from the sun [MASK] photovoltaic cells, while wind [MASK] generates electricity through turbines. [MASK] uses the energy of [MASK] water to produce electricity. [MASK] technologies are being adopted [MASK] an increasing rate as [MASK] seek to reduce their [MASK] emissions and transition to [MASK] energy sources. The growth [MASK] renewable energy is not [MASK] challenges, including the need [MASK] improved energy storage solutions [MASK] the integration of these [MASK] into existing power grids.
Watermarked Text:
Renewable energy is gaining recognition as a viable solution to the world's energy needs. solar wind, and hydro power are among the most common forms of renewable energy, and they offer a sustainable alternative to fossil fuels. Solar power generates energy from the sun through photovoltaic cells, while wind power generates electricity through turbines. hydro uses the energy of drinking water to produce electricity. new technologies are being adopted at an increasing rate as countries seek to reduce their carbon emissions and transition to renewable energy sources. The growth of renewable energy is not without challenges, including the need for improved energy storage solutions and the integration of these technologies into existing power grids.
```

```
Highest Match Ratio: 0.782608695652174
Average of Other Ratios: 0.4891304347826087
T-Statistic: -5.7132994884546475
P-Value: 0.010638289400887542
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.7272727272727273
Average of Other Ratios: 0.5454545454545454
T-Statistic: -2.529822128134705
P-Value: 0.08543743614799877
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 6
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 7
___________________________________________________________________________________________________________________________
```

```
Done 1 th word
Done 6 th word
Done 11 th word
Done 16 th word
Done 21 th word
Done 26 th word
Done 31 th word
Done 36 th word
Done 41 th word
Done 46 th word
Done 51 th word
Done 56 th word
Done 61 th word
Done 66 th word
Done 71 th word
Done 76 th word
Done 81 th word
Done 86 th word
Done 91 th word
Done 96 th word
Done 101 th word
Done 106 th word
Original Text:
The rise of e-commerce has transformed the retail industry. Online shopping has become increasingly popular, offering consumers convenience and a wide range of products at their fingertips. Major e-commerce platforms like Amazon, Alibaba, and eBay have disrupted traditional brick-and-mortar stores, leading to significant changes in consumer behavior. The COVID-19 pandemic further accelerated the shift to online shopping, as lockdowns and social distancing measures limited in-person shopping. While e-commerce offers many benefits, it also presents challenges, such as the need for efficient logistics and concerns about data privacy. As the industry continues to evolve, companies are exploring new technologies like augmented reality and artificial intelligence to enhance the online shopping experience.
Watermark Areas:
The rise of e-commerce [MASK] transformed the retail industry. [MASK] shopping has become increasingly [MASK] offering consumers convenience and [MASK] wide range of products [MASK] their fingertips. Major e-commerce [MASK] like Amazon, Alibaba, and [MASK] have disrupted traditional brick-and-mortar [MASK] leading to significant changes [MASK] consumer behavior. The COVID-19 [MASK] further accelerated the shift [MASK] online shopping, as lockdowns [MASK] social distancing measures limited [MASK] shopping. While e-commerce offers [MASK] benefits, it also presents [MASK] such as the need [MASK] efficient logistics and concerns [MASK] data privacy. As the [MASK] continues to evolve, companies [MASK] exploring new technologies like [MASK] reality and artificial intelligence [MASK] enhance the online shopping [MASK]
Watermarked Text:
The rise of e-commerce has transformed the retail industry. online shopping has become increasingly popular offering consumers convenience and a wide range of products at their fingertips. Major e-commerce companies like Amazon, Alibaba, and others have disrupted traditional brick-and-mortar shopping leading to significant changes in consumer behavior. The COVID-19 has further accelerated the shift towards online shopping, as lockdowns and social distancing measures limited online shopping. While e-commerce offers many benefits, it also presents challenges such as the need for efficient logistics and concerns about data privacy. As the internet continues to evolve, companies are exploring new technologies like augmented reality and artificial intelligence to enhance the online shopping .
```

```
Highest Match Ratio: 0.5454545454545454
Average of Other Ratios: 0.4599802371541502
T-Statistic: -2.439848527409759
P-Value: 0.0925127409364643
The highest ratio is not significantly different from the others.
```

```
[[0.5789473684210527, 0.4375, -11.315789473684214, 0.0014802633181549077],
 [0.7368421052631579, 0.4992690058479532, -4.1686084498315, 0.02512588315462554],
 [0.631578947368421, 0.47368421052631576, -2.5980762113533156, 0.12168993434632014],
 [0.6666666666666666, 0.5051169590643274, -3.25528426992502, 0.047299956469803],
 [0.5454545454545454, 0.44047619047619047, -1.9884454013529376, 0.1408964116764383],
 [0.7272727272727273, 0.5454545454545454, -2.529822128134705, 0.08543743614799877],
 [0.5, 0.4285714285714286, -3.674234614174766, 0.034896984510150934],
 [0.45, 0.36140350877192984, -1.789925042646048, 0.21535497619213528],
 [0.6363636363636364, 0.5524891774891776, -4.925394256602069, 0.01603915968463389],
 [0.6363636363636364, 0.5004940711462451, -2.8968775241076448, 0.0626611732957653],
 [0.75, 0.5776315789473684, -4.168368422873468, 0.02512970789136552],
 [0.6818181818181818, 0.5568181818181819, -3.666666666666662, 0.03508151471548204],
 [0.7391304347826086, 0.42984189723320154, -7.289590560310877, 0.005329596912408047],
 [0.782608695652174, 0.4936594202898551, -7.972508980104777, 0.004117361652430399],
 [0.6363636363636364, 0.5113636363636364, -3.22047024073016, 0.04856685655980099],
 [0.6521739130434783, 0.5434782608695652, -2.3797114365109158, 0.09764327274027122],
 [0.631578947368421, 0.47368421052631576, -2.5980762113533156, 0.12168993434632014],
 [0.7142857142857143, 0.5367965367965368, -2.3442928638434024, 0.10082728660926546],
 [0.47619047619047616, 0.38095238095238093, -3.4641016151377544, 0.07417990022744853],
 [0.5, 0.4285714285714286, -3.674234614174766, 0.034896984510150934],
 [0.5454545454545454, 0.4631093544137022, -4.2649449620933755, 0.05082148124684452],
 [0.782608695652174, 0.4936594202898551, -7.972508980104777, 0.004117361652430399],
 [0.6363636363636364, 0.5113636363636364, -3.22047024073016, 0.04856685655980099],
 [0.6521739130434783, 0.5434782608695652, -2.3797114365109158, 0.09764327274027122],
 [0.631578947368421, 0.47368421052631576, -2.5980762113533156, 0.12168993434632014],
 [0.7142857142857143, 0.5367965367965368, -2.3442928638434024, 0.10082728660926546],
 [0.47619047619047616, 0.38095238095238093, -3.4641016151377544, 0.07417990022744853],
 [0.5, 0.4285714285714286, -3.674234614174766, 0.034896984510150934],
 [0.5454545454545454, 0.4631093544137022, -4.2649449620933755, 0.05082148124684452]]
```

```
print(f"{'Highest Ratio':<20} {'Average Others':<20} {'T-Statistic':<20} {'P-Value':<20} || {'Highest Ratio':<20} {'Average Others':<20} {'T-Statistic':<20} {'P-Value':<20}")
# Print each pair of result rows side by side: plain run vs. watermarked run
for sig, wm_sig in zip(list_of_significance, list_of_significance_watermarked):
    print(f"{sig[0]:<20} {sig[1]:<20} {sig[2]:<20} {sig[3]:<20} || {wm_sig[0]:<20} {wm_sig[1]:<20} {wm_sig[2]:<20} {wm_sig[3]:<20}")
```

```
Highest Ratio Average Others T-Statistic P-Value || Highest Ratio Average Others T-Statistic P-Value
0.5789473684210527 0.4375 -11.315789473684214 0.0014802633181549077 || 0.65 0.41666666666666663 -5.17705132919467 0.013988180239752648
0.7368421052631579 0.4992690058479532 -4.1686084498315 0.02512588315462554 || 0.631578947368421 0.5072368421052631 -3.7039840906304633 0.034183520845761046
0.631578947368421 0.47368421052631576 -2.5980762113533156 0.12168993434632014 || 0.8 0.44999999999999996 -9.899494936611665 0.002192318898657741
0.6666666666666666 0.5051169590643274 -3.25528426992502 0.047299956469803 || 0.631578947368421 0.5052631578947369 -3.3070695276573017 0.04549183755402306
0.5454545454545454 0.44047619047619047 -1.9884454013529376 0.1408964116764383 || 0.6363636363636364 0.3932806324110672 -5.671556095740365 0.010858631561421467
0.7272727272727273 0.5454545454545454 -2.529822128134705 0.08543743614799877 || 0.782608695652174 0.5 -3.0929011843007626 0.0535919356301439
0.5 0.4285714285714286 -3.674234614174766 0.034896984510150934 || 0.5454545454545454 0.39377470355731226 -3.7778595133554176 0.032490871457917674
0.45 0.36140350877192984 -1.789925042646048 0.21535497619213528 || 0.5714285714285714 0.407936507936508 -2.525754294555235 0.12746322930311096
0.6363636363636364 0.5524891774891776 -4.925394256602069 0.01603915968463389 || 0.7272727272727273 0.5093873517786561 -4.676780667650381 0.018467037431746196
0.6363636363636364 0.5004940711462451 -2.8968775241076448 0.0626611732957653 || 0.7083333333333334 0.5040760869565217 -3.3948179538648735 0.042623438183825496
0.75 0.5776315789473684 -4.168368422873468 0.02512970789136552 || 0.6190476190476191 0.5369047619047619 -3.7736294416002862 0.03258485403885965
0.6818181818181818 0.5568181818181819 -3.666666666666662 0.03508151471548204 || 0.6956521739130435 0.532608695652174 -3.382407126012729 0.043014906734981546
0.7391304347826086 0.42984189723320154 -7.289590560310877 0.005329596912408047 || 0.7083333333333334 0.4433876811594203 -4.490136077665652 0.02061159932091642
0.782608695652174 0.4936594202898551 -7.972508980104777 0.004117361652430399 || 0.6666666666666666 0.5354166666666667 -5.5468407098514305 0.011553575011403559
0.6363636363636364 0.5113636363636364 -3.22047024073016 0.04856685655980099 || 0.6086956521739131 0.47826086956521735 -4.242640687119289 0.023981199790656615
0.6521739130434783 0.5434782608695652 -2.3797114365109158 0.09764327274027122 || 0.8333333333333334 0.46557971014492755 -4.234837745291732 0.02409863068609194
0.631578947368421 0.47368421052631576 -2.5980762113533156 0.12168993434632014 || 0.6 0.48333333333333334 -3.4999999999999987 0.07282735005446936
0.7142857142857143 0.5367965367965368 -2.3442928638434024 0.10082728660926546 || 0.782608695652174 0.549901185770751 -4.594812178568088 0.01937136230021868
0.47619047619047616 0.38095238095238093 -3.4641016151377544 0.07417990022744853 || 0.6363636363636364 0.4090909090909091 -4.08248290463863 0.026547885467199484
0.5 0.4285714285714286 -3.674234614174766 0.034896984510150934 || 0.7272727272727273 0.42539525691699603 -4.9868551538544414 0.015503886330756058
0.5454545454545454 0.4631093544137022 -4.2649449620933755 0.05082148124684452 || 0.6956521739130435 0.44157608695652173 -7.251548965980652 0.0054102533801680865
0.782608695652174 0.4936594202898551 -7.972508980104777 0.004117361652430399 || 0.625 0.49222222222222217 -3.9180327868852447 0.05939767081769266
0.6363636363636364 0.5113636363636364 -3.22047024073016 0.04856685655980099 || 0.5217391304347826 0.44565217391304346 -3.6556307750696546 0.03535284700251738
0.6521739130434783 0.5434782608695652 -2.3797114365109158 0.09764327274027122 || 0.6666666666666666 0.5 -13.279056191361398 0.005623287315631082
0.631578947368421 0.47368421052631576 -2.5980762113533156 0.12168993434632014 || 0.7 0.475 -8.999999999999995 0.0028958121618641495
0.7142857142857143 0.5367965367965368 -2.3442928638434024 0.10082728660926546 || 0.7391304347826086 0.5602766798418972 -3.356266857779692 0.04385449037496923
0.47619047619047616 0.38095238095238093 -3.4641016151377544 0.07417990022744853 || 0.5 0.38636363636363635 -2.611164839335468 0.07960498081790623
0.5 0.4285714285714286 -3.674234614174766 0.034896984510150934 || 0.6086956521739131 0.44318181818181823 -4.855072463768116 0.01668150816820796
0.5454545454545454 0.4631093544137022 -4.2649449620933755 0.05082148124684452 || 0.6666666666666666 0.4673913043478261 -2.77438299767925 0.0693145043773778
```
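One compact way to read the side-by-side table above is as a detection rate: the fraction of test cases whose p-value clears a 0.05 threshold, computed separately for the plain and watermarked columns. A minimal sketch, using two tiny placeholder lists in the same `[highest ratio, average of others, t-statistic, p-value]` row format (the real `list_of_significance` / `list_of_significance_watermarked` lists are much longer):

```python
# Each row is [highest ratio, average of others, t-statistic, p-value].
# Tiny placeholder lists standing in for the full significance lists.
plain = [[0.58, 0.44, -11.3, 0.0015], [0.63, 0.47, -2.6, 0.1217]]
marked = [[0.65, 0.42, -5.2, 0.0140], [0.80, 0.45, -9.9, 0.0022]]

def detection_rate(rows, alpha=0.05):
    """Fraction of rows whose p-value falls below alpha."""
    return sum(r[3] < alpha for r in rows) / len(rows)

print(detection_rate(plain), detection_rate(marked))  # prints: 0.5 1.0
```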

```
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
# Assuming list_of_significance and list_of_significance_watermarked are already defined
# Create DataFrames from the lists
df_significance = pd.DataFrame(list_of_significance, columns=['Highest Ratio', 'Average Others', 'T-Statistic', 'P-Value'])
df_significance_watermarked = pd.DataFrame(list_of_significance_watermarked, columns=['Highest Ratio', 'Average Others', 'T-Statistic', 'P-Value'])
# Add a label column to distinguish between the two sets
df_significance['Label'] = 'Original'
df_significance_watermarked['Label'] = 'Watermarked'
# Combine the DataFrames
combined_df = pd.concat([df_significance, df_significance_watermarked], ignore_index=True)
# Perform EDA
def perform_eda(df):
    # Display the first few rows of the DataFrame
    print("First few rows of the DataFrame:")
    print(df.head())
    # Display statistical summary
    print("\nStatistical Summary:")
    print(df.describe())
    # Check for missing values
    print("\nMissing Values:")
    print(df.isnull().sum())
    # Visualize the distributions of the features
    plt.figure(figsize=(12, 8))
    sns.histplot(data=df, x='Highest Ratio', hue='Label', element='step', kde=True)
    plt.title('Distribution of Highest Ratio')
    plt.show()
    plt.figure(figsize=(12, 8))
    sns.histplot(data=df, x='Average Others', hue='Label', element='step', kde=True)
    plt.title('Distribution of Average Others')
    plt.show()
    plt.figure(figsize=(12, 8))
    sns.histplot(data=df, x='T-Statistic', hue='Label', element='step', kde=True)
    plt.title('Distribution of T-Statistic')
    plt.show()
    plt.figure(figsize=(12, 8))
    sns.histplot(data=df, x='P-Value', hue='Label', element='step', kde=True)
    plt.title('Distribution of P-Value')
    plt.show()
    # Pairplot to see relationships
    sns.pairplot(df, hue='Label')
    plt.show()
    # Correlation matrix
    plt.figure(figsize=(10, 8))
    sns.heatmap(df.drop(columns=['Label']).corr(), annot=True, cmap='coolwarm')
    plt.title('Correlation Matrix')
    plt.show()
    # T-test to check for significant differences
    original = df[df['Label'] == 'Original']
    watermarked = df[df['Label'] == 'Watermarked']
    for column in ['Highest Ratio', 'Average Others', 'T-Statistic', 'P-Value']:
        t_stat, p_value = ttest_ind(original[column], watermarked[column])
        print(f"T-test for {column}: T-Statistic = {t_stat}, P-Value = {p_value}")
# Perform EDA on the combined DataFrame
perform_eda(combined_df)
# Check if the data is ready for machine learning classification
# Prepare the data
X = combined_df.drop(columns=['Label'])
y = combined_df['Label']
# Convert labels to numerical values for ML model
y = y.map({'Original': 0, 'Watermarked': 1})
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a RandomForestClassifier
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)
# Make predictions
y_pred = clf.predict(X_test)
# Evaluate the model
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
# Feature importances
feature_importances = clf.feature_importances_
# Create a DataFrame for feature importances
feature_importances_df = pd.DataFrame({
    'Feature': X.columns,
    'Importance': feature_importances
}).sort_values(by='Importance', ascending=False)
# Plot feature importances
plt.figure(figsize=(12, 8))
sns.barplot(x='Importance', y='Feature', data=feature_importances_df, palette='viridis')
plt.title('Feature Importances')
plt.show()
# Heatmap for feature importances
plt.figure(figsize=(10, 8))
sns.heatmap(feature_importances_df.set_index('Feature').T, annot=True, cmap='viridis')
plt.title('Heatmap of Feature Importances')
plt.show()
```

```
First few rows of the DataFrame:
Highest Ratio Average Others T-Statistic P-Value Label
0 0.233333 0.182203 -3.532758 0.038563 Original
1 0.203390 0.139195 -3.440591 0.041218 Original
2 0.338983 0.270339 -2.228608 0.112142 Original
3 0.254237 0.168362 -2.451613 0.246559 Original
4 0.288136 0.210876 -5.467540 0.012026 Original
Statistical Summary:
Highest Ratio Average Others T-Statistic P-Value
count 4000.000000 4000.000000 3999.000000 3999.000000
mean 0.490285 0.339968 -6.076672 0.036783
std 0.128376 0.082900 5.580957 0.043217
min 0.101695 0.066667 -111.524590 0.000002
25% 0.416667 0.296610 -6.938964 0.006418
50% 0.491525 0.354732 -4.431515 0.021973
75% 0.573770 0.398224 -3.176861 0.052069
max 0.868852 0.580601 -1.166065 0.451288
Missing Values:
Highest Ratio 0
Average Others 0
T-Statistic 1
P-Value 1
Label 0
dtype: int64
```

```
T-test for Highest Ratio: T-Statistic = -57.59965843801415, P-Value = 0.0
T-test for Average Others: T-Statistic = -21.080776226637518, P-Value = 1.2478046488137352e-93
T-test for T-Statistic: T-Statistic = nan, P-Value = nan
T-test for P-Value: T-Statistic = nan, P-Value = nan
```
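The `nan` results for `T-Statistic` and `P-Value` come from the missing values visible in the summary above (count 3999 vs. 4000): `scipy.stats.ttest_ind` propagates NaNs by default. A minimal sketch of the usual fix with `nan_policy='omit'`, on made-up sample arrays:

```python
import numpy as np
from scipy.stats import ttest_ind

a = np.array([0.63, 0.51, np.nan, 0.72])  # one missing observation
b = np.array([0.48, 0.44, 0.50, 0.46])

t_default, _ = ttest_ind(a, b)                  # NaN poisons the statistic
t_omit, _ = ttest_ind(a, b, nan_policy='omit')  # missing value is dropped
print(t_default, t_omit)
```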

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[46], line 91
89 # Train a RandomForestClassifier
90 clf = RandomForestClassifier(random_state=42)
---> 91 clf.fit(X_train, y_train)
93 # Make predictions
94 y_pred = clf.predict(X_test)
ValueError: Input X contains NaN.
RandomForestClassifier does not accept missing values encoded as NaN natively. For supervised learning, you might want to consider sklearn.ensemble.HistGradientBoostingClassifier and Regressor which accept missing values encoded as NaNs natively. Alternatively, it is possible to preprocess the data, for instance by using an imputer transformer in a pipeline or drop samples with missing values. See https://scikit-learn.org/stable/modules/impute.html You can find a list of all estimators that handle NaN values at the following page: https://scikit-learn.org/stable/modules/impute.html#estimators-that-handle-nan-values
```
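The `ValueError` above is the same missing value tripping the classifier: `RandomForestClassifier` rejects NaN features. The cell below handles it with `dropna()`; an alternative sketch (on a toy feature matrix, since only one row is affected here) is to impute inside a pipeline instead of dropping rows:

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for the combined feature matrix, with one NaN entry.
X = pd.DataFrame({'Highest Ratio': [0.6, 0.5, 0.7, 0.4, 0.65, 0.45],
                  'T-Statistic': [-3.2, np.nan, -4.1, -2.0, -5.0, -1.5]})
y = [1, 0, 1, 0, 1, 0]

# Median imputation feeds a NaN-free matrix to the forest.
clf = make_pipeline(SimpleImputer(strategy='median'),
                    RandomForestClassifier(random_state=42))
clf.fit(X, y)
print(clf.predict(X))
```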

```
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix
# Assuming list_of_significance and list_of_significance_watermarked are already defined
# Create DataFrames from the lists
df_significance = pd.DataFrame(list_of_significance, columns=['Highest Ratio', 'Average Others', 'T-Statistic', 'P-Value'])
df_significance_watermarked = pd.DataFrame(list_of_significance_watermarked, columns=['Highest Ratio', 'Average Others', 'T-Statistic', 'P-Value'])
# Add a label column to distinguish between the two sets
df_significance['Label'] = 'Original'
df_significance_watermarked['Label'] = 'Watermarked'
# Combine the DataFrames
combined_df = pd.concat([df_significance, df_significance_watermarked], ignore_index=True)
combined_df = combined_df.dropna()
# Prepare the data
X = combined_df.drop(columns=['Label'])
y = combined_df['Label']
# Convert labels to numerical values for ML model
y = y.map({'Original': 0, 'Watermarked': 1})
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize models
models = {
    'Logistic Regression': LogisticRegression(random_state=42, max_iter=1000),
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Random Forest': RandomForestClassifier(random_state=42),
    'Support Vector Machine': SVC(random_state=42),
    'Gradient Boosting': GradientBoostingClassifier(random_state=42),
    'AdaBoost': AdaBoostClassifier(random_state=42),
    'Naive Bayes': GaussianNB(),
    'K-Nearest Neighbors': KNeighborsClassifier()
}
# Train and evaluate models
for model_name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(f"\n{model_name} Classification Report:")
    print(classification_report(y_test, y_pred))
    print(f"\n{model_name} Confusion Matrix:")
    print(confusion_matrix(y_test, y_pred))
    # Feature importances (only for models that provide them)
    if hasattr(model, 'feature_importances_'):
        feature_importances = model.feature_importances_
        feature_importances_df = pd.DataFrame({
            'Feature': X.columns,
            'Importance': feature_importances
        }).sort_values(by='Importance', ascending=False)
        # Plotting is disabled here; uncomment to draw per-model importances.
        # plt.figure(figsize=(12, 8))
        # sns.barplot(x='Importance', y='Feature', data=feature_importances_df, palette='viridis')
        # plt.title(f'{model_name} Feature Importances')
        # plt.show()
```

```
Logistic Regression Classification Report:
precision recall f1-score support
0 0.83 0.87 0.85 415
1 0.85 0.81 0.83 385
accuracy 0.84 800
macro avg 0.84 0.84 0.84 800
weighted avg 0.84 0.84 0.84 800
Logistic Regression Confusion Matrix:
[[360 55]
[ 73 312]]
Decision Tree Classification Report:
precision recall f1-score support
0 0.92 0.92 0.92 415
1 0.91 0.91 0.91 385
accuracy 0.91 800
macro avg 0.91 0.91 0.91 800
weighted avg 0.91 0.91 0.91 800
Decision Tree Confusion Matrix:
[[380 35]
[ 35 350]]
Random Forest Classification Report:
precision recall f1-score support
0 0.91 0.94 0.93 415
1 0.94 0.90 0.92 385
accuracy 0.92 800
macro avg 0.92 0.92 0.92 800
weighted avg 0.92 0.92 0.92 800
Random Forest Confusion Matrix:
[[391 24]
[ 39 346]]
Support Vector Machine Classification Report:
precision recall f1-score support
0 0.71 0.79 0.75 415
1 0.74 0.66 0.70 385
accuracy 0.73 800
macro avg 0.73 0.72 0.72 800
weighted avg 0.73 0.72 0.72 800
Support Vector Machine Confusion Matrix:
[[327 88]
[132 253]]
Gradient Boosting Classification Report:
precision recall f1-score support
0 0.93 0.94 0.94 415
1 0.94 0.92 0.93 385
accuracy 0.94 800
macro avg 0.94 0.93 0.93 800
weighted avg 0.94 0.94 0.93 800
Gradient Boosting Confusion Matrix:
[[392 23]
[ 29 356]]
AdaBoost Classification Report:
precision recall f1-score support
0 0.90 0.91 0.90 415
1 0.90 0.89 0.89 385
accuracy 0.90 800
macro avg 0.90 0.90 0.90 800
weighted avg 0.90 0.90 0.90 800
AdaBoost Confusion Matrix:
[[376 39]
[ 44 341]]
Naive Bayes Classification Report:
precision recall f1-score support
0 0.78 0.81 0.79 415
1 0.78 0.76 0.77 385
accuracy 0.78 800
macro avg 0.78 0.78 0.78 800
weighted avg 0.78 0.78 0.78 800
Naive Bayes Confusion Matrix:
[[335 80]
[ 94 291]]
K-Nearest Neighbors Classification Report:
precision recall f1-score support
0 0.82 0.87 0.84 415
1 0.85 0.79 0.82 385
accuracy 0.83 800
macro avg 0.83 0.83 0.83 800
weighted avg 0.83 0.83 0.83 800
K-Nearest Neighbors Confusion Matrix:
[[361 54]
[ 81 304]]
```
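Eight classification reports are hard to compare at a glance; collecting one accuracy score per model makes the ranking explicit. A minimal sketch on synthetic data (the notebook's own `models` dict and train/test split would slot in directly):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the train/test split built above.
X, y = make_classification(n_samples=400, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {'Logistic Regression': LogisticRegression(max_iter=1000),
          'Decision Tree': DecisionTreeClassifier(random_state=42)}
scores = {name: accuracy_score(y_test, m.fit(X_train, y_train).predict(X_test))
          for name, m in models.items()}
print(pd.Series(scores).sort_values(ascending=False))
```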

```
import os
import random
def extract_test_cases(folder_path, num_cases=2000, words_per_case=300):
    test_cases = []
    book_files = [f for f in os.listdir(folder_path) if os.path.isfile(os.path.join(folder_path, f))]
    # Calculate the number of test cases to extract from each book
    cases_per_book = num_cases // len(book_files)
    extra_cases = num_cases % len(book_files)
    for book_file in book_files:
        with open(os.path.join(folder_path, book_file), 'r', encoding='utf-8') as file:
            text = file.read()
        words = text.split()
        num_words = len(words)
        # Ensure enough words are available to extract the cases
        if num_words < words_per_case:
            continue
        # Determine the number of cases to extract from this book
        num_cases_from_book = cases_per_book
        if extra_cases > 0:
            num_cases_from_book += 1
            extra_cases -= 1
        for _ in range(num_cases_from_book):
            start_index = random.randint(0, num_words - words_per_case)
            case = ' '.join(words[start_index:start_index + words_per_case])
            test_cases.append(case)
            if len(test_cases) == num_cases:
                return test_cases
    return test_cases
# Usage example
folder_path = 'books'
test_cases = extract_test_cases(folder_path)
# Output the number of test cases created
print(f"Number of test cases created: {len(test_cases)}")
```

```
Number of test cases created: 2000
```
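Because `extract_test_cases` draws start offsets with `random.randint`, each run samples different snippets. Seeding the module-level RNG first makes the extraction reproducible; this is a general Python point, sketched here on a synthetic word list rather than the `books` folder:

```python
import random

words = [f"w{i}" for i in range(1000)]  # stand-in for one book's word list

def sample_case(words, words_per_case=300):
    # Same windowed sampling as in extract_test_cases above.
    start = random.randint(0, len(words) - words_per_case)
    return words[start:start + words_per_case]

random.seed(0)
first = sample_case(words)
random.seed(0)
second = sample_case(words)
print(first == second)  # prints: True (same seed, same snippet)
```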

```
list_of_significance = []
list_of_significance_watermarked = []
count_t = 0
for text in test_cases:
    count_t += 1
    print("___________________________________________________________________________________________________________________________")
    print("Doing", count_t)
    print("___________________________________________________________________________________________________________________________")
    words_to_add = ["example", "test", "random", "insert"]
    num_words_to_add = 5
    # modified_text = randomly_add_words(text, words_to_add, num_words_to_add)
    modified_text = randomly_add_words(watermark_text(text, offset=0), words_to_add, num_words_to_add)
    # print("Original Text:")
    # print(text)
    # print("\nModified Text:")
    # print(modified_text)
    match_ratios = watermark_text_and_calculate_matches(modified_text, max_offset=5)
    # print(match_ratios)
    list_of_significance_watermarked.append(check_significant_difference(match_ratios))
    match_ratios = watermark_text_and_calculate_matches(text, max_offset=5)
    list_of_significance.append(check_significant_difference(match_ratios))
    print("___________________________________________________________________________________________________________________________")
    print("Done", count_t)
    print("___________________________________________________________________________________________________________________________")
```
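The helpers used in this loop (`watermark_text`, `randomly_add_words`, `watermark_text_and_calculate_matches`, `check_significant_difference`) are defined earlier in the notebook. For readers of this excerpt, here is a hypothetical, self-contained sketch of what a `check_significant_difference`-style test can look like, consistent with the `[highest ratio, average of others, t-statistic, p-value]` rows printed below; the real implementation may differ:

```python
from scipy.stats import ttest_1samp

def check_significant_difference_sketch(match_ratios):
    """Hypothetical sketch: one-sample t-test of the non-best offsets'
    match ratios against the best offset's ratio."""
    ratios = sorted(match_ratios)
    highest, others = ratios[-1], ratios[:-1]
    t_stat, p_value = ttest_1samp(others, highest)
    return [highest, sum(others) / len(others), t_stat, p_value]

row = check_significant_difference_sketch([0.6, 0.4, 0.45, 0.42, 0.38])
print(row)
```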

```
___________________________________________________________________________________________________________________________
Doing 1
___________________________________________________________________________________________________________________________
```

```
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.22814207650273222
T-Statistic: -21.334991021776784
P-Value: 0.00022530414214046572
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.1822033898305085
T-Statistic: -3.53275826407369
P-Value: 0.038562976693981454
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 1
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 2
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4098360655737705
Average of Other Ratios: 0.23333333333333334
T-Statistic: -4.992251154606664
P-Value: 0.015458009685690827
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1391949152542373
T-Statistic: -3.4405910948750495
P-Value: 0.04121820653114378
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 2
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 3
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.65
Average of Other Ratios: 0.34815573770491803
T-Statistic: -6.977885499593617
P-Value: 0.0060406875581721555
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3389830508474576
Average of Other Ratios: 0.27033898305084747
T-Statistic: -2.228607614649941
P-Value: 0.11214158967770235
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 3
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 4
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6333333333333333
Average of Other Ratios: 0.2573087431693989
T-Statistic: -17.794177111160675
P-Value: 0.0003870090924213516
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.16836158192090395
T-Statistic: -2.451612903225806
P-Value: 0.2465587655124727
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 4
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 5
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.3151639344262295
T-Statistic: -1.713189822924711
P-Value: 0.18519433572899746
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.21087570621468926
T-Statistic: -5.467540160267347
P-Value: 0.012025943288987453
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 5
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 6
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.20416666666666666
T-Statistic: -8.101361023294555
P-Value: 0.003930735409185079
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.1307909604519774
T-Statistic: -11.145126479863883
P-Value: 0.0015479966208348658
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 6
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 7
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.27424863387978143
T-Statistic: -2.647512144273123
P-Value: 0.07715790266759627
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.1603813559322034
T-Statistic: -4.047402698396378
P-Value: 0.027156785257683596
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 7
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 8
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5333333333333333
Average of Other Ratios: 0.28599726775956286
T-Statistic: -9.817142706536112
P-Value: 0.002246603044354501
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3
Average of Other Ratios: 0.22033898305084745
T-Statistic: -11.51260179108094
P-Value: 0.0014069485474090153
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 8
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 9
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5245901639344263
Average of Other Ratios: 0.2375
T-Statistic: -5.570367388129549
P-Value: 0.011418116056075428
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.18149717514124292
T-Statistic: -3.7964977175244834
P-Value: 0.03208088709594881
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 9
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 10
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.2775273224043716
T-Statistic: -4.727075685541707
P-Value: 0.01793917650756737
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.16871468926553673
T-Statistic: -3.7922455055393622
P-Value: 0.03217383630567124
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 10
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 11
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.4098360655737705
Average of Other Ratios: 0.25
T-Statistic: -3.714254520543179
P-Value: 0.033941551397426564
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.1398305084745763
T-Statistic: -4.975896705378727
P-Value: 0.015597602000975219
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 11
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 12
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.6333333333333333
Average of Other Ratios: 0.24904371584699453
T-Statistic: -14.463091326070671
P-Value: 0.0007165776065197027
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.20261299435028246
T-Statistic: -3.3817063110386885
P-Value: 0.04303714816975945
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 12
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 13
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.275
T-Statistic: -5.83498532451519
P-Value: 0.01002850287511932
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.18177966101694915
T-Statistic: -3.365869501933496
P-Value: 0.04354367094755919
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 13
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 14
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5245901639344263
Average of Other Ratios: 0.23750000000000002
T-Statistic: -15.536893060799459
P-Value: 0.0005793474370025991
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.1477401129943503
T-Statistic: -3.4073375272246085
P-Value: 0.04223311481214282
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 14
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 15
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.7
Average of Other Ratios: 0.32745901639344266
T-Statistic: -6.77625507348341
P-Value: 0.006568385846444286
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4067796610169492
Average of Other Ratios: 0.27838983050847455
T-Statistic: -2.9307183932115923
P-Value: 0.06096526759447833
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 15
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 16
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.21666666666666667
T-Statistic: -4.119026835630454
P-Value: 0.025932160329463834
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.16666666666666666
Average of Other Ratios: 0.1228813559322034
T-Statistic: -10.333333333333332
P-Value: 0.001933293191806968
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 16
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 17
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.23333333333333334
T-Statistic: -9.613578441019637
P-Value: 0.0023886490069146135
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.21101694915254238
T-Statistic: -4.384236405710172
P-Value: 0.02197310950253267
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 17
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 18
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.38333333333333336
Average of Other Ratios: 0.2775956284153005
T-Statistic: -2.196141651943659
P-Value: 0.11558815206376069
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.14745762711864407
T-Statistic: -2.9054879908745583
P-Value: 0.062224127599699926
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 18
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 19
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.29480874316939887
T-Statistic: -3.9127157656292244
P-Value: 0.029668656491470317
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.21087570621468926
T-Statistic: -2.5909821905688375
P-Value: 0.08100515899541934
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 19
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 20
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3770491803278688
Average of Other Ratios: 0.2125
T-Statistic: -4.70897478231848
P-Value: 0.018126865049367218
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.21666666666666667
Average of Other Ratios: 0.1440677966101695
T-Statistic: -2.967580383634676
P-Value: 0.05918282683371976
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 20
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 21
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6166666666666667
Average of Other Ratios: 0.3523907103825137
T-Statistic: -5.470026246143757
P-Value: 0.012010754748633298
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4067796610169492
Average of Other Ratios: 0.3206920903954802
T-Statistic: -4.689814564762172
P-Value: 0.018328326818943686
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 21
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 22
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3442622950819672
Average of Other Ratios: 0.2875
T-Statistic: -2.0013563154719005
P-Value: 0.13914298161809877
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.13940677966101694
T-Statistic: -5.09630233956434
P-Value: 0.014606958299961344
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 22
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 23
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.35
Average of Other Ratios: 0.22786885245901642
T-Statistic: -2.9204148617045544
P-Value: 0.06147547219131401
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.11440677966101695
T-Statistic: -6.328859555783819
P-Value: 0.007975829484392977
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 23
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 24
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.2573770491803279
T-Statistic: -6.707544913199492
P-Value: 0.0067620282205820385
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1896892655367232
T-Statistic: -2.202614379084967
P-Value: 0.1148909616099501
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 24
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 25
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6229508196721312
Average of Other Ratios: 0.20833333333333334
T-Statistic: -7.9670319147285165
P-Value: 0.004125551420928103
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.12662429378531073
T-Statistic: -5.286516953753678
P-Value: 0.013202833807875401
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 25
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 26
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3770491803278688
Average of Other Ratios: 0.24583333333333332
T-Statistic: -12.513573727485
P-Value: 0.0011000978118262336
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3
Average of Other Ratios: 0.1864406779661017
T-Statistic: -2.9476070119292004
P-Value: 0.060140398566708664
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 26
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 27
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.31953551912568307
T-Statistic: -7.4329764222856545
P-Value: 0.005039460040419282
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3898305084745763
Average of Other Ratios: 0.2657485875706215
T-Statistic: -7.979030913275075
P-Value: 0.004107637499059575
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 27
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 28
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.2
T-Statistic: -10.721537070870632
P-Value: 0.0017348546540608164
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.14336158192090395
T-Statistic: -3.0162558762483083
P-Value: 0.05692805669660707
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 28
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 29
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.1825136612021858
T-Statistic: -9.313256255544996
P-Value: 0.002620765891462303
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.21666666666666667
Average of Other Ratios: 0.1271186440677966
T-Statistic: -8.184904804985008
P-Value: 0.0038156709515571787
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 29
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 30
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.27759562841530055
T-Statistic: -4.406020658142731
P-Value: 0.02168373286648725
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.1822033898305085
T-Statistic: -4.794807132575457
P-Value: 0.017258873277490094
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 30
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 31
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5333333333333333
Average of Other Ratios: 0.21967213114754097
T-Statistic: -6.698633516057464
P-Value: 0.006787690665758875
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.13516949152542374
T-Statistic: -9.58743044198646
P-Value: 0.002407749109248065
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 31
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 32
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.2444672131147541
T-Statistic: -5.639134297794538
P-Value: 0.011033909197241593
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.16454802259887005
T-Statistic: -3.6666666666666683
P-Value: 0.03508151471548188
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 32
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 33
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.23224043715846998
T-Statistic: -9.669790958355271
P-Value: 0.0023482624188435656
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.15176553672316384
T-Statistic: -1.9438723809014464
P-Value: 0.14715294859631137
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 33
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 34
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.21530054644808744
T-Statistic: -4.776258392255463
P-Value: 0.017441795571057156
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.15176553672316384
T-Statistic: -2.698151855052503
P-Value: 0.0739016273969179
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 34
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 35
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.13749999999999998
T-Statistic: -41.60551556348084
P-Value: 3.055734072793683e-05
The highest ratio is significantly different from the others.
```

```
C:\Users\rrath\.conda\envs\py310\lib\site-packages\scipy\stats\_stats_py.py:1103: RuntimeWarning: divide by zero encountered in divide
var *= np.divide(n, n-ddof) # to avoid error on division by zero
C:\Users\rrath\.conda\envs\py310\lib\site-packages\scipy\stats\_stats_py.py:1103: RuntimeWarning: invalid value encountered in scalar multiply
var *= np.divide(n, n-ddof) # to avoid error on division by zero
```

```
Highest Match Ratio: 0.1016949152542373
Average of Other Ratios: 0.06666666666666667
T-Statistic: nan
P-Value: nan
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 35
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 36
___________________________________________________________________________________________________________________________
```

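The `nan` t-statistic and p-value in run 35 trace back to the scipy `RuntimeWarning` printed with it: the division by `n - ddof` failing suggests the "other ratios" sample was a single value (or otherwise had undefined variance), so the t-test cannot be computed. A minimal reproduction of this degenerate case, assuming the test used is `scipy.stats.ttest_1samp`:

```python
import warnings

import numpy as np
from scipy import stats

# Hypothetical single remaining ratio: with one observation the sample
# variance (ddof=1) is undefined, so the t-statistic comes out as nan.
others = [0.06666666666666667]
highest = 0.1016949152542373

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # suppress the divide-by-zero warnings
    t_stat, p_value = stats.ttest_1samp(others, popmean=highest)

print(t_stat, p_value)

# Guarding against the nan keeps the report well-defined; falling back to
# "not significant" matches the verdict logged above for run 35.
if np.isnan(t_stat):
    print("The highest ratio is not significantly different from the others.")
```

Checking for the `nan` (or requiring at least two "other" ratios before testing) avoids reporting an undefined comparison as a result.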
```
Highest Match Ratio: 0.5245901639344263
Average of Other Ratios: 0.19583333333333333
T-Statistic: -11.940183637404086
P-Value: 0.001263509863921225
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.14058380414312618
T-Statistic: -2.4173228346456677
P-Value: 0.13686029083311824
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 36
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 37
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.29439890710382516
T-Statistic: -2.2491233682903635
P-Value: 0.11002736160816107
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.1519774011299435
T-Statistic: -5.686705315838459
P-Value: 0.010777981645506028
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 37
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 38
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.21598360655737703
T-Statistic: -12.468162549596142
P-Value: 0.0011119786230779946
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.25
Average of Other Ratios: 0.17372881355932202
T-Statistic: -2.3302720008113575
P-Value: 0.10212247896202177
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 38
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 39
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5409836065573771
Average of Other Ratios: 0.27083333333333337
T-Statistic: -8.587746675724997
P-Value: 0.0033191706279838665
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.21949152542372882
T-Statistic: -5.019825255742886
P-Value: 0.015226316305783671
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 39
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 40
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.26584699453551913
T-Statistic: -3.724628638572125
P-Value: 0.03369936057429459
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1602401129943503
T-Statistic: -3.8619097169864767
P-Value: 0.030693492269553303
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 40
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 41
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5166666666666667
Average of Other Ratios: 0.34460382513661203
T-Statistic: -3.314218356932012
P-Value: 0.0452491572848592
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.23596986817325802
T-Statistic: -39.57142857142854
P-Value: 0.0006380001300463167
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 41
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 42
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.20416666666666666
T-Statistic: -12.828316972577708
P-Value: 0.001022214116280989
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.12238700564971752
T-Statistic: -4.38578568651365
P-Value: 0.02195236472320489
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 42
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 43
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.27759562841530055
T-Statistic: -4.08198313178942
P-Value: 0.026556436001686043
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.1519774011299435
T-Statistic: -5.530747598736873
P-Value: 0.011647446932010377
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 43
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 44
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6229508196721312
Average of Other Ratios: 0.21250000000000002
T-Statistic: -32.83606557377048
P-Value: 6.20825070326001e-05
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.21666666666666667
Average of Other Ratios: 0.15254237288135594
T-Statistic: -5.350441310978211
P-Value: 0.012770724119522098
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 44
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 45
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.24849726775956282
T-Statistic: -3.748296362699404
P-Value: 0.03315504943587395
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.14357344632768362
T-Statistic: -5.128214329323895
P-Value: 0.01435823217533278
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 45
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 46
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5245901639344263
Average of Other Ratios: 0.2625
T-Statistic: -10.53248144497122
P-Value: 0.0018279382457190715
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1646186440677966
T-Statistic: -4.953014798968853
P-Value: 0.015795677695098962
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 46
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 47
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.29166666666666663
T-Statistic: -2.5878220140515227
P-Value: 0.0812271381568774
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.17323446327683617
T-Statistic: -5.3345252289586895
P-Value: 0.012876563862984138
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 47
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 48
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5166666666666667
Average of Other Ratios: 0.21598360655737703
T-Statistic: -8.750839688124422
P-Value: 0.003142480189931068
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1643361581920904
T-Statistic: -3.2248357074853122
P-Value: 0.04840566051832893
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 48
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 49
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.6333333333333333
Average of Other Ratios: 0.232103825136612
T-Statistic: -8.35174130824408
P-Value: 0.0035988783002935975
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.16871468926553673
T-Statistic: -3.011507892829531
P-Value: 0.05714319479454041
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 49
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 50
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.22083333333333333
T-Statistic: -13.413790344368145
P-Value: 0.0008957658359722933
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.1646186440677966
T-Statistic: -16.597491007684166
P-Value: 0.0004760985758523895
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 50
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 51
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.27814207650273226
T-Statistic: -3.820785157614083
P-Value: 0.03155653621002948
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3
Average of Other Ratios: 0.16525423728813557
T-Statistic: -6.7289971752910285
P-Value: 0.0067007729656368455
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 51
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 52
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.7
Average of Other Ratios: 0.20744535519125684
T-Statistic: -28.591452014534944
P-Value: 9.39404974108921e-05
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.13940677966101697
T-Statistic: -4.870967741935483
P-Value: 0.016533426116271753
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 52
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 53
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.39344262295081966
Average of Other Ratios: 0.29583333333333334
T-Statistic: -9.308639696291548
P-Value: 0.0026245624151365297
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.1689265536723164
T-Statistic: -3.6720208922977697
P-Value: 0.03495083324055868
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 53
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 54
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.20348360655737702
T-Statistic: -5.6752883391613915
P-Value: 0.010838689050252247
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.15586158192090396
T-Statistic: -2.738286769182844
P-Value: 0.07144110545918902
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 54
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 55
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.35
Average of Other Ratios: 0.26939890710382514
T-Statistic: -2.481709453531588
P-Value: 0.08913501383686977
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.14350282485875704
T-Statistic: -3.588902734990965
P-Value: 0.03705188832887151
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 55
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 56
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.28592896174863386
T-Statistic: -3.800322116899045
P-Value: 0.031997583361784786
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.18983050847457625
T-Statistic: -4.577628510425044
P-Value: 0.01956818745991966
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 56
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 57
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.2533469945355191
T-Statistic: -6.384490208675115
P-Value: 0.007780483735954091
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1853813559322034
T-Statistic: -3.734927184999753
P-Value: 0.033461118399696864
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 57
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 58
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.3333333333333333
T-Statistic: -5.678855106783206
P-Value: 0.010819675519646264
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.21346516007532956
T-Statistic: -7.919995572991999
P-Value: 0.015570889550764348
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 58
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 59
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.38333333333333336
Average of Other Ratios: 0.26536885245901637
T-Statistic: -2.8854448330676328
P-Value: 0.06324741595265697
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.17796610169491528
T-Statistic: -3.7720217587055536
P-Value: 0.03262066446770594
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 59
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 60
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.2583333333333333
T-Statistic: -6.9322595553517825
P-Value: 0.006155066763755107
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3389830508474576
Average of Other Ratios: 0.1815677966101695
T-Statistic: -10.554502580376617
P-Value: 0.0018167602089980005
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 60
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 61
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.3125
T-Statistic: -7.574268290069089
P-Value: 0.004773601555369254
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1605225988700565
T-Statistic: -5.3922713771638495
P-Value: 0.012497927330704648
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 61
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 62
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6333333333333333
Average of Other Ratios: 0.27377049180327867
T-Statistic: -36.553205244976375
P-Value: 4.503243730199633e-05
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.19837570621468928
T-Statistic: -2.935710690049308
P-Value: 0.06071996824652531
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 62
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 63
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.14508196721311475
T-Statistic: -16.724450142912833
P-Value: 0.00046542921319575733
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.13559322033898305
Average of Other Ratios: 0.11789077212806026
T-Statistic: -1.9366012620612738
P-Value: 0.19241125153029964
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 63
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 64
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.2791666666666667
T-Statistic: -14.728977904018867
P-Value: 0.000678879499435165
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.19004237288135595
T-Statistic: -4.661502359215338
P-Value: 0.01863137464464403
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 64
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 65
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.30293715846994534
T-Statistic: -4.392324491038346
P-Value: 0.021865089257387872
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1730225988700565
T-Statistic: -4.393369811625047
P-Value: 0.021851178700442182
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 65
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 66
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.26249999999999996
T-Statistic: -2.9860360155946784
P-Value: 0.05831495178179603
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.1822033898305085
T-Statistic: -3.3377105216719656
P-Value: 0.04446314563719635
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 66
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 67
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5666666666666667
Average of Other Ratios: 0.18237704918032788
T-Statistic: -16.083566531034208
P-Value: 0.0005227722151560427
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.13509887005649718
T-Statistic: -6.972220994378161
P-Value: 0.0060547347831298665
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 67
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 68
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.24002732240437158
T-Statistic: -3.3909481332454887
P-Value: 0.04274501456307789
The highest ratio is significantly different from the others.
```
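Each result block above reports a one-sample t-test of the remaining candidates' match ratios against the highest ratio; the logged t-statistics and two-sided p-values are consistent with a test over four "other" ratios (df = 3). Below is a minimal, stdlib-only sketch of that decision rule. The ratio values are hypothetical, and the notebook's actual test function (plausibly `scipy.stats.ttest_1samp`) is an assumption:

```python
import math
import statistics

# Two-sided critical t at alpha = 0.05, keyed by degrees of freedom.
# Only df = 3 (four "other" ratios) appears in the runs logged above.
T_CRIT_05 = {3: 3.182}

def significance_report(highest, others):
    """One-sample t-test of the other ratios against the highest ratio (hypothetical sketch)."""
    n = len(others)
    mean = statistics.mean(others)
    sd = statistics.stdev(others)                 # sample standard deviation (n - 1)
    t = (mean - highest) / (sd / math.sqrt(n))    # negative when the others fall below the highest
    significant = abs(t) > T_CRIT_05[n - 1]
    print(f"Highest Match Ratio: {highest}")
    print(f"Average of Other Ratios: {mean}")
    print(f"T-Statistic: {t}")
    verdict = "" if significant else "not "
    print(f"The highest ratio is {verdict}significantly different from the others.")
    return t, significant

# Hypothetical ratios in the same shape as the logged runs:
t, sig = significance_report(0.45, [0.30, 0.22, 0.25, 0.19])
```

With a critical-value table instead of an exact p-value, the verdict matches a two-sided test at alpha = 0.05 without any non-stdlib dependency; swapping in `scipy.stats.ttest_1samp(others, popmean=highest)` would additionally reproduce the logged p-values.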

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.15607344632768363
T-Statistic: -4.4996604055507525
P-Value: 0.020494569188795435
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 68
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 69
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.35
Average of Other Ratios: 0.25703551912568307
T-Statistic: -2.6079704990661914
P-Value: 0.07982458937806818
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.13905367231638419
T-Statistic: -7.532784229621696
P-Value: 0.004849699633498134
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 69
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 70
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.38333333333333336
Average of Other Ratios: 0.2782103825136612
T-Statistic: -3.2919830028303934
P-Value: 0.04600941654903949
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.2234463276836158
T-Statistic: -3.191276519463654
P-Value: 0.049662470752465
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 70
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 71
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.27342896174863385
T-Statistic: -3.9770819524294616
P-Value: 0.02843255328181686
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.17330508474576273
T-Statistic: -2.1914765988605094
P-Value: 0.11609378562161582
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 71
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 72
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.22083333333333333
T-Statistic: -8.141189027892935
P-Value: 0.0038753168087557023
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.19004237288135595
T-Statistic: -3.5786014890819224
P-Value: 0.03732340310992795
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 72
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 73
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.27499999999999997
T-Statistic: -7.102306152917264
P-Value: 0.005742682031524343
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.17309322033898306
T-Statistic: -4.574785584853341
P-Value: 0.019601000667008463
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 73
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 74
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.21974043715846997
T-Statistic: -4.850332934764336
P-Value: 0.016725995889598343
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.16666666666666666
Average of Other Ratios: 0.10593220338983052
T-Statistic: -2.315042178850111
P-Value: 0.10355228938029219
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 74
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 75
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.307103825136612
T-Statistic: -12.422543126716244
P-Value: 0.0011240861310550566
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.21101694915254235
T-Statistic: -5.377347857529729
P-Value: 0.01259437003534445
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 75
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 76
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.7049180327868853
Average of Other Ratios: 0.19583333333333333
T-Statistic: -12.238447225159446
P-Value: 0.001174764152702687
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.17704802259887006
T-Statistic: -4.366099615611315
P-Value: 0.022217857702086435
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 76
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 77
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.2416666666666667
T-Statistic: -4.10684476131458
P-Value: 0.026135349573198702
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.1573446327683616
T-Statistic: -4.756098094357983
P-Value: 0.04147677166903169
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 77
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 78
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.21591530054644809
T-Statistic: -21.638546889559418
P-Value: 0.00021600181591014574
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.17309322033898306
T-Statistic: -4.258383219097977
P-Value: 0.023746522626264043
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 78
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 79
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.39344262295081966
Average of Other Ratios: 0.25416666666666665
T-Statistic: -3.491255808134439
P-Value: 0.0397307943380083
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.14745762711864407
T-Statistic: -2.850671138558804
P-Value: 0.06507315235014592
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 79
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 80
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.38333333333333336
Average of Other Ratios: 0.24849726775956282
T-Statistic: -2.5061607287884398
P-Value: 0.08723186353716599
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.16883239171374767
T-Statistic: -1.940875951377627
P-Value: 0.19179259810170837
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 80
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 81
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.24166666666666667
T-Statistic: -5.438466110898458
P-Value: 0.012205447523675413
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.25
Average of Other Ratios: 0.13983050847457626
T-Statistic: -13.57805716454443
P-Value: 0.0008640560542882232
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 81
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 82
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36065573770491804
Average of Other Ratios: 0.275
T-Statistic: -1.7374154679173113
P-Value: 0.18070726370520965
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1687853107344633
T-Statistic: -4.943577756944967
P-Value: 0.01587832283622757
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 82
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 83
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5166666666666667
Average of Other Ratios: 0.25669398907103824
T-Statistic: -3.5751586418022976
P-Value: 0.03741471383610712
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.25
Average of Other Ratios: 0.19491525423728814
T-Statistic: -3.7527767497325675
P-Value: 0.03305327992358387
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 83
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 84
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.2791666666666667
T-Statistic: -9.089163278771835
P-Value: 0.002813782076578305
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1853813559322034
T-Statistic: -3.734927184999753
P-Value: 0.033461118399696864
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 84
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 85
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.2708333333333333
T-Statistic: -5.018214936247723
P-Value: 0.01523972222476046
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.14336158192090395
T-Statistic: -9.12238026954469
P-Value: 0.002784009395450553
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 85
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 86
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6557377049180327
Average of Other Ratios: 0.23333333333333334
T-Statistic: -19.631581158153004
P-Value: 0.00028877725744096686
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.15600282485875705
T-Statistic: -4.763659834348825
P-Value: 0.017567478307811073
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 86
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 87
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.26133879781420766
T-Statistic: -3.406748909038209
P-Value: 0.04225136416729629
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1515065913370998
T-Statistic: -4.591824862480486
P-Value: 0.044299673534495966
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 87
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 88
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.22370218579234974
T-Statistic: -3.560576108205079
P-Value: 0.03780464605083627
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.1096045197740113
T-Statistic: -9.714285714285717
P-Value: 0.002316933797952584
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 88
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 89
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.29077868852459016
T-Statistic: -8.670001234457226
P-Value: 0.003228460547141703
The highest ratio is significantly different from the others.
```
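The decision printed in each result block (t-statistic, p-value, significant or not) is consistent with a one-sample t-test of the "other" candidates' match ratios against the highest ratio; the printed p-values imply four other candidates (df = 3), which is why the t-statistics are negative. A minimal pure-Python sketch of that rule, using hypothetical ratio values and a hardcoded two-sided critical value in place of an exact p-value:

```python
# Hypothetical sketch of the decision rule implied by the log output above:
# a one-sample t-test asking whether the match ratios of the non-watermarked
# candidates differ from the highest (watermarked) candidate's ratio.
from math import sqrt

def t_statistic(sample, popmean):
    """One-sample t statistic: (mean - popmean) / (s / sqrt(n)), sample std with ddof=1."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - popmean) / (sqrt(var) / sqrt(n))

# Hypothetical ratios for four non-watermarked candidates vs. the highest ratio:
others = [0.30, 0.28, 0.31, 0.27]
highest = 0.60

t = t_statistic(others, highest)
# Two-sided critical value for alpha = 0.05 at df = 3 is about 3.182.
significant = abs(t) > 3.182
print(f"T-Statistic: {t:.3f}, significant: {significant}")
```

With scipy available, `scipy.stats.ttest_1samp(others, popmean=highest)` returns the same t-statistic together with an exact two-sided p-value, matching the numbers in the logs.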

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.2490819209039548
T-Statistic: -3.390087084881917
P-Value: 0.04277212562874923
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 89
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 90
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.2791666666666667
T-Statistic: -11.945500838297065
P-Value: 0.001261851176283919
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.19858757062146892
T-Statistic: -2.978265932915313
P-Value: 0.058678380877150695
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 90
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 91
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.2780054644808743
T-Statistic: -7.059001645570319
P-Value: 0.005844158386152648
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.2024482109227872
T-Statistic: -2.539664030967854
P-Value: 0.12632350193838743
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 91
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 92
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6885245901639344
Average of Other Ratios: 0.275
T-Statistic: -14.961882623555587
P-Value: 0.0006479949161931086
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3333333333333333
Average of Other Ratios: 0.23728813559322035
T-Statistic: -8.01387685344754
P-Value: 0.004056193290243036
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 92
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 93
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.24583333333333335
T-Statistic: -10.170122389206956
P-Value: 0.002025718899581995
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1434322033898305
T-Statistic: -5.819049164593993
P-Value: 0.01010570903664765
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 93
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 94
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.22916666666666669
T-Statistic: -18.649446940693135
P-Value: 0.0003365083085818159
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.20218926553672317
T-Statistic: -2.735771061149155
P-Value: 0.07159231928803704
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 94
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 95
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.24890710382513662
T-Statistic: -17.77843472912827
P-Value: 0.0003880303178033284
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.13919491525423727
T-Statistic: -4.655912421566584
P-Value: 0.018691975738082102
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 95
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 96
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.2
T-Statistic: -11.19804462246161
P-Value: 0.001526562128286031
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.14766949152542372
T-Statistic: -3.1966382474552573
P-Value: 0.04945892607281697
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 96
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 97
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.21154371584699455
T-Statistic: -6.028584041580653
P-Value: 0.009149467578500266
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.25
Average of Other Ratios: 0.13559322033898305
T-Statistic: -16.53405576378645
P-Value: 0.00048155159575766156
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 97
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 98
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.39344262295081966
Average of Other Ratios: 0.2375
T-Statistic: -5.077417805154715
P-Value: 0.014756796916773422
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.13509887005649718
T-Statistic: -2.456740106111629
P-Value: 0.09113124582704853
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 98
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 99
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.2695355191256831
T-Statistic: -4.505004550324794
P-Value: 0.02042928010369098
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.15635593220338984
T-Statistic: -2.448992878796489
P-Value: 0.09176170732285532
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 99
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 100
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.30327868852459017
T-Statistic: -5.139479033055133
P-Value: 0.014271753686067527
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.17295197740112994
T-Statistic: -12.697505573117574
P-Value: 0.0010536643393062766
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 100
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 101
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5666666666666667
Average of Other Ratios: 0.23162568306010928
T-Statistic: -5.174508393880651
P-Value: 0.01400713962990326
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.18545197740112995
T-Statistic: -2.597095416447633
P-Value: 0.08057786815687772
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 101
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 102
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3770491803278688
Average of Other Ratios: 0.2333333333333333
T-Statistic: -2.6611003960675528
P-Value: 0.07626699967069282
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1602401129943503
T-Statistic: -7.0060661223464
P-Value: 0.00597143776668511
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 102
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 103
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.2375
T-Statistic: -8.083990107136325
P-Value: 0.003955234173311845
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.16871468926553673
T-Statistic: -3.3950093870826388
P-Value: 0.042617435426627326
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 103
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 104
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5409836065573771
Average of Other Ratios: 0.16666666666666669
T-Statistic: -20.79300879817042
P-Value: 0.00024328442858722197
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1478813559322034
T-Statistic: -2.899678131794266
P-Value: 0.06251860149004074
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 104
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 105
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.22916666666666663
T-Statistic: -7.437123752141218
P-Value: 0.0050313817019966775
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.13559322033898305
Average of Other Ratios: 0.10131826741996235
T-Statistic: -1.9782608695652164
P-Value: 0.1864941692932114
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 105
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 106
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.23750000000000002
T-Statistic: -20.5280562633211
P-Value: 0.00025277231465770014
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.15628531073446328
T-Statistic: -3.1594347385098827
P-Value: 0.05089331398223453
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 106
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 107
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.35
T-Statistic: -8.241430969943103
P-Value: 0.00374033200083373
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3559322033898305
Average of Other Ratios: 0.29964689265536726
T-Statistic: -10.350649350649334
P-Value: 0.001923817806020011
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 107
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 108
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6065573770491803
Average of Other Ratios: 0.1958333333333333
T-Statistic: -10.95264116575592
P-Value: 0.0016294169815328763
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.14766949152542372
T-Statistic: -5.257547050264664
P-Value: 0.013404952501338205
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 108
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 109
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36065573770491804
Average of Other Ratios: 0.26666666666666666
T-Statistic: -2.270928029445486
P-Value: 0.10783275809661891
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.15190677966101696
T-Statistic: -2.742051411140234
P-Value: 0.07121556090757529
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 109
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 110
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.2416666666666667
T-Statistic: -11.07944631333403
P-Value: 0.0015751594215650242
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3333333333333333
Average of Other Ratios: 0.17796610169491528
T-Statistic: -7.701540462154052
P-Value: 0.004549748975956458
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 110
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 111
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3770491803278688
Average of Other Ratios: 0.25
T-Statistic: -2.8812045893326337
P-Value: 0.063466587684043
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.10557909604519773
T-Statistic: -7.476466358952792
P-Value: 0.004955591570430506
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 111
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 112
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.19904371584699454
T-Statistic: -12.288393956261835
P-Value: 0.0011607197809402983
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.11843220338983051
T-Statistic: -3.149160708078649
P-Value: 0.05129865697051939
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 112
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 113
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5333333333333333
Average of Other Ratios: 0.23633879781420766
T-Statistic: -4.584804480637148
P-Value: 0.019485678388664683
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1602401129943503
T-Statistic: -2.951009970239908
P-Value: 0.05997588776618918
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 113
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 114
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.24881602914389797
T-Statistic: -3.563314918926205
P-Value: 0.07052714913781105
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.18559322033898307
T-Statistic: -2.848958479370646
P-Value: 0.06516476187287569
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 114
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 115
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.39344262295081966
Average of Other Ratios: 0.19583333333333333
T-Statistic: -4.628326083672306
P-Value: 0.018994819352054024
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.11798493408662901
T-Statistic: -5.251610061723054
P-Value: 0.034398946176199485
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 115
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 116
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.298155737704918
T-Statistic: -4.033530534554061
P-Value: 0.027402529576258054
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.24152542372881358
T-Statistic: -10.949598818482546
P-Value: 0.001630748987105004
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 116
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 117
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.2693306010928962
T-Statistic: -6.652844379359568
P-Value: 0.006921592104201834
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.15593220338983052
T-Statistic: -6.457745685519285
P-Value: 0.007532728000207892
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 117
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 118
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.22404371584699456
T-Statistic: -12.321363422263417
P-Value: 0.0011515711663196136
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.13926553672316386
T-Statistic: -5.731425162505742
P-Value: 0.010544436415387572
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 118
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 119
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.2528688524590164
T-Statistic: -9.302810429704943
P-Value: 0.002629366644977586
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1864406779661017
Average of Other Ratios: 0.16299435028248588
T-Statistic: -3.608695652173914
P-Value: 0.06894253641177729
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 119
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 120
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.2778688524590164
T-Statistic: -4.536961611771408
P-Value: 0.020044437314205348
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3220338983050847
Average of Other Ratios: 0.1518361581920904
T-Statistic: -9.402305491422489
P-Value: 0.0025489364534890947
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 120
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 121
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.29453551912568304
T-Statistic: -13.407344687092644
P-Value: 0.0008970413965513573
The highest ratio is significantly different from the others.
```
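The statistics printed in each block above appear to come from a one-sample t-test comparing the non-best match ratios against the highest one. The following is a minimal sketch of that check, assuming SciPy; the function name `check_watermark` and the example `ratios` list are hypothetical, and the notebook's actual ratio computation may differ.

```python
# Hypothetical reconstruction of the significance test seen in the logs.
from scipy import stats

def check_watermark(ratios, alpha=0.05):
    """Test whether the best match ratio stands out from the rest.

    `ratios` is an assumed list of match ratios, one per candidate.
    A one-sample t-test asks whether the mean of the other ratios
    differs significantly from the highest ratio.
    """
    highest = max(ratios)
    others = [r for r in ratios if r != highest]
    t_stat, p_value = stats.ttest_1samp(others, popmean=highest)
    print(f"Highest Match Ratio: {highest}")
    print(f"Average of Other Ratios: {sum(others) / len(others)}")
    print(f"T-Statistic: {t_stat}")
    print(f"P-Value: {p_value}")
    if p_value < alpha:
        print("The highest ratio is significantly different from the others.")
    else:
        print("The highest ratio is not significantly different from the others.")
    return p_value < alpha
```

Because the other ratios sit below the highest one, the t-statistic is negative, matching the logs; a p-value below 0.05 yields the "significantly different" verdict.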


```
Highest Match Ratio: 0.25
Average of Other Ratios: 0.211864406779661
T-Statistic: -7.794228634059958
P-Value: 0.004395375691816533
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 121
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 122
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.8360655737704918
Average of Other Ratios: 0.22083333333333333
T-Statistic: -33.295454627318755
P-Value: 5.9553511839070765e-05
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.24152542372881355
T-Statistic: -3.098582276011423
P-Value: 0.05335457237433866
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 122
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 123
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.257172131147541
T-Statistic: -22.28671843401477
P-Value: 0.0001977851067158647
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3
Average of Other Ratios: 0.24152542372881355
T-Statistic: -3.1118145559317116
P-Value: 0.052806824094954664
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 123
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 124
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.14166666666666666
T-Statistic: -49.30752240433902
P-Value: 1.836912647385703e-05
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.17372881355932202
T-Statistic: -2.8133333333333344
P-Value: 0.0671073341823401
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 124
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 125
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.7868852459016393
Average of Other Ratios: 0.19166666666666665
T-Statistic: -18.866186551774142
P-Value: 0.00032511960811709917
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.17274011299435027
T-Statistic: -4.913402497837348
P-Value: 0.016146391435118687
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 125
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 126
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.7049180327868853
Average of Other Ratios: 0.17500000000000002
T-Statistic: -13.259465772580212
P-Value: 0.0009269807118292367
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.13926553672316386
T-Statistic: -3.6722304933420005
P-Value: 0.03494573014722615
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 126
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 127
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.6333333333333333
Average of Other Ratios: 0.2614071038251366
T-Statistic: -11.56614172997057
P-Value: 0.00138784136278078
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.21497175141242938
T-Statistic: -2.7808379940637797
P-Value: 0.06894254556283926
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 127
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 128
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.26666666666666666
T-Statistic: -3.4878013959288654
P-Value: 0.039830015747165624
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.21115819209039546
T-Statistic: -2.8047829882173874
P-Value: 0.06758426686914822
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 128
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 129
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.22418032786885247
T-Statistic: -17.367164105589854
P-Value: 0.0004160299549552091
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.19067796610169493
T-Statistic: -5.2571452098620695
P-Value: 0.01340778439483409
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 129
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 130
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.23654371584699457
T-Statistic: -7.528860163609132
P-Value: 0.0048569805574603395
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.11822033898305086
T-Statistic: -6.986639340848923
P-Value: 0.0060190635058901916
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 130
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 131
___________________________________________________________________________________________________________________________
```


```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.22916666666666666
T-Statistic: -7.321262735084583
P-Value: 0.005263650485292702
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.16045197740112996
T-Statistic: -3.771816669089535
P-Value: 0.03262523637568915
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 131
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 132
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.25
T-Statistic: -9.922022842122464
P-Value: 0.0021777725182429894
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1605225988700565
T-Statistic: -2.828645932579055
P-Value: 0.0662636402779568
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 132
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 133
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6885245901639344
Average of Other Ratios: 0.14166666666666666
T-Statistic: -27.567153326380936
P-Value: 0.00010477109294231701
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.15254237288135594
Average of Other Ratios: 0.10103578154425613
T-Statistic: -3.0973237391013955
P-Value: 0.0903361791294822
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 133
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 134
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.7704918032786885
Average of Other Ratios: 0.19999999999999998
T-Statistic: -17.482855438271482
P-Value: 0.00040788898005848117
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.1434322033898305
T-Statistic: -9.210615078582297
P-Value: 0.0027069229317899495
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 134
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 135
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.2941939890710382
T-Statistic: -4.012426942160979
P-Value: 0.027781897561995554
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.1896186440677966
T-Statistic: -4.894736842105265
P-Value: 0.016315166384743927
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 135
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 136
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.8524590163934426
Average of Other Ratios: 0.275
T-Statistic: -22.287637793898487
P-Value: 0.00019776074953457593
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3559322033898305
Average of Other Ratios: 0.23213276836158192
T-Statistic: -5.049357722281286
P-Value: 0.014983160961383876
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 136
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 137
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4098360655737705
Average of Other Ratios: 0.3
T-Statistic: -3.441600870206092
P-Value: 0.04118787585017441
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.2236581920903955
T-Statistic: -3.6741121386261355
P-Value: 0.03489996095058197
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 137
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 138
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.3110655737704918
T-Statistic: -4.021031471649375
P-Value: 0.027626407165013953
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.2531779661016949
T-Statistic: -2.8305893057056326
P-Value: 0.06615751198285083
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 138
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 139
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.28572404371584703
T-Statistic: -3.0768255794880175
P-Value: 0.054270837524734016
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.23587570621468926
T-Statistic: -2.137186834969644
P-Value: 0.16604961117054062
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 139
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 140
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.2030737704918033
T-Statistic: -11.28407580595368
P-Value: 0.0014925480755074861
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.18333333333333332
Average of Other Ratios: 0.11864406779661017
T-Statistic: -3.5335467141319046
P-Value: 0.038541217308166585
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 140
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 141
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3770491803278688
Average of Other Ratios: 0.26666666666666666
T-Statistic: -2.4183597074545036
P-Value: 0.09430740171744867
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.12662429378531073
T-Statistic: -5.193353369212653
P-Value: 0.013867428943147485
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 141
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 142
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.24453551912568308
T-Statistic: -3.448799623718753
P-Value: 0.0409724672224204
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.1434322033898305
T-Statistic: -6.313436023720237
P-Value: 0.00803112776427947
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 142
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 143
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.21550546448087432
T-Statistic: -2.656953690777749
P-Value: 0.07653752004862353
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.15600282485875705
T-Statistic: -3.541825936051609
P-Value: 0.03831367266160501
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 143
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 144
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.639344262295082
Average of Other Ratios: 0.25416666666666665
T-Statistic: -36.732973581097305
P-Value: 4.437567678781254e-05
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.16440677966101697
T-Statistic: -8.34181386665146
P-Value: 0.00361131585696273
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 144
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 145
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.2583333333333333
T-Statistic: -9.290054918133373
P-Value: 0.002639919880417069
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1348870056497175
T-Statistic: -6.050693757052515
P-Value: 0.009055606464007803
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 145
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 146
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.639344262295082
Average of Other Ratios: 0.23333333333333334
T-Statistic: -9.555035305421034
P-Value: 0.0024316941386992286
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.18135593220338983
T-Statistic: -5.276561879022918
P-Value: 0.01327183724912819
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 146
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 147
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.45901639344262296
Average of Other Ratios: 0.25
T-Statistic: -5.347493377444
P-Value: 0.012790241471607905
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.17796610169491525
T-Statistic: -5.028024029479735
P-Value: 0.015158299248004988
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 147
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 148
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.2739071038251366
T-Statistic: -24.893769141544713
P-Value: 0.00014212863367693852
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.1822033898305085
T-Statistic: -7.92070349524896
P-Value: 0.004195693967370819
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 148
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 149
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.35
Average of Other Ratios: 0.26994535519125684
T-Statistic: -3.13501995090005
P-Value: 0.051863239303626886
The highest ratio is not significantly different from the others.
```


```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.18545197740112995
T-Statistic: -2.9798032437751925
P-Value: 0.05860625147640999
The highest ratio is not significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 149
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 150
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.7213114754098361
Average of Other Ratios: 0.2875
T-Statistic: -13.79033606511491
P-Value: 0.0008252499042687573
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.3728813559322034
Average of Other Ratios: 0.26958568738229755
T-Statistic: -2.957002218815846
P-Value: 0.09786519767422906
The highest ratio is not significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 150
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 151
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.5409836065573771
Average of Other Ratios: 0.26666666666666666
T-Statistic: -3.469865323288583
P-Value: 0.04035029930545199
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.1652542372881356
T-Statistic: -3.6229338549736347
P-Value: 0.03617265345530896
The highest ratio is significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 151
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 152
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.2864071038251366
T-Statistic: -9.166199952585503
P-Value: 0.00274536805342564
The highest ratio is significantly different from the others.
```


```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.22796610169491527
T-Statistic: -2.846542418148333
P-Value: 0.06529426992156556
The highest ratio is not significantly different from the others.
```

___________________________________________________________________________________________________________________________
Done 152
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 153
___________________________________________________________________________________________________________________________


```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.27363387978142073
T-Statistic: -4.238851608241817
P-Value: 0.02403813124545163
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.14774011299435028
T-Statistic: -4.041966945913288
P-Value: 0.02725273967177257
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 153
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 154
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6721311475409836
Average of Other Ratios: 0.23750000000000002
T-Statistic: -12.680742877830454
P-Value: 0.001057786978105176
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.19406779661016949
T-Statistic: -3.4332517325533063
P-Value: 0.041439515376910444
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 154
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 155
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.21250000000000002
T-Statistic: -5.6891750976260695
P-Value: 0.010764907709515388
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.14738700564971752
T-Statistic: -4.874543567127261
P-Value: 0.016500349284977168
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 155
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 156
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.22916666666666669
T-Statistic: -5.917936455036488
P-Value: 0.009638872740618382
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.211864406779661
T-Statistic: -4.387862045841163
P-Value: 0.021924602127579518
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 156
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 157
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5409836065573771
Average of Other Ratios: 0.27499999999999997
T-Statistic: -7.078346628927314
P-Value: 0.005798537274283227
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1796610169491525
T-Statistic: -5.666666666666667
P-Value: 0.029758752589905717
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 157
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 158
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.2573087431693989
T-Statistic: -12.130613891363168
P-Value: 0.0012058641450002766
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.19830508474576272
T-Statistic: -3.682947537517003
P-Value: 0.034686070852458215
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 158
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 159
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.38333333333333336
Average of Other Ratios: 0.25314207650273224
T-Statistic: -3.7198098893146967
P-Value: 0.03381158141645187
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.15160075329566855
T-Statistic: -4.166330062408052
P-Value: 0.05306537932277536
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 159
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 160
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36666666666666664
Average of Other Ratios: 0.29453551912568304
T-Statistic: -4.000535767249803
P-Value: 0.02799863936714751
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.19392655367231637
T-Statistic: -2.224345699469886
P-Value: 0.11258691448891019
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 160
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 161
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.1867486338797814
T-Statistic: -18.377401251596062
P-Value: 0.00035156723441787536
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.1056497175141243
T-Statistic: -4.336541993961348
P-Value: 0.022624336731357778
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 161
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 162
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.2833333333333333
T-Statistic: -12.870123433871873
P-Value: 0.0010124261840382058
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1797551789077213
T-Statistic: -3.8907727779580643
P-Value: 0.060159036553398035
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 162
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 163
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.2625
T-Statistic: -4.652549903587426
P-Value: 0.018728552464152025
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.18968926553672316
T-Statistic: -2.986928104575163
P-Value: 0.058273407134112075
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 163
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 164
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.24863387978142074
T-Statistic: -5.629498300891231
P-Value: 0.01108671624446389
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.14048964218455745
T-Statistic: -3.229591652487888
P-Value: 0.08397470085153524
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 164
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 165
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.47540983606557374
Average of Other Ratios: 0.22916666666666669
T-Statistic: -4.253993084342693
P-Value: 0.023811667807213974
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.1694915254237288
Average of Other Ratios: 0.1391949152542373
T-Statistic: -3.8838243353571547
P-Value: 0.03024595201311427
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 165
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 166
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.22916666666666666
T-Statistic: -10.55444127076214
P-Value: 0.0018167912040410702
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.16885593220338982
T-Statistic: -3.6939328104988762
P-Value: 0.03442246894212467
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 166
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 167
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.28237704918032785
T-Statistic: -9.511627906976749
P-Value: 0.0024642759907036603
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.20268361581920902
T-Statistic: -3.2230287355249403
P-Value: 0.04847230125929893
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 167
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 168
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.29166666666666663
T-Statistic: -55.22686591346667
P-Value: 1.307697160598242e-05
The highest ratio is significantly different from the others.
```

```
Token indices sequence length is longer than the specified maximum sequence length for this model (518 > 512). Running this sequence through the model will result in indexing errors
```

```
Error processing chunk 'cheese, my digestion, why hast thou not served thyself in setting my table so many ? Come, what’s Agamemnon? THERSITES. my commander, Achilles. Then tell me Patroclus, what’s Achilles? PATROCLUS. my lord, Thersites. Then tell me I pray thee, what’s not THERSITES. Thy knower, Patroclus. then tell me, Patroclus, what art thou? PATROCLUS. Thou must know that knowest. ACHILLES. O, pray tell, THERSITES. I’ll decline the whole question. Agamemnon commands . Achilles is my lord; i am Patroclus’ knower; and achilles is a fool. PATROCLUS. o rascal! THERSITES. Peace, fool! i have not done. ACHILLES. achilles is a privileg’d man. o Thersites. THERSITES. Agamemnon is a fool; Achilles is a fool Thersites is a fool; and as aforesaid, Patroclus is a fool. ACHILLES. Derive this; o THERSITES. Agamemnon is a fool to offer to command . Achilles is a fool to be commanded of Agamemnon; achilles is a fool to be such a fool; and yet Patroclus is a fool . PATROCLUS. Why am I a fool? THERSITES. Make that name of the Creator. It tells me thou art. Look , who comes here? Enter : Ulysses, Nestor, Diomedes, Ajax , Calchas. ACHILLES. Come, Patroclus, and speak with nobody. Come speak with me, Thersites. [_Exit_.] but Here is such patchery, such juggling, and such knavery. and the argument is a fool and a cuckold—a good way to draw emulous factions to bleed to death upon. take the dry serpigo on the subject, and war and war confound all! [_Exit_.] AGAMEMNON. where is Achilles? PATROCLUS. Within the tent; but ill-dispos’d, my lord AGAMEMNON. Let it be known to him that we are here. He shent our men and we lay by our appertainings, visiting of him. let him be told so; [MASK] perchance, he think We': The size of tensor a (518) must match the size of tensor b (512) at non-singleton dimension 1
Error processing chunk 'cheese, my digestion, why hast thou not served thyself in setting my table so many ? Come, what’s Agamemnon? THERSITES. my commander, Achilles. Then tell me Patroclus, what’s Achilles? PATROCLUS. my lord, Thersites. Then tell me I pray thee, what’s not THERSITES. Thy knower, Patroclus. then tell me, Patroclus, what art thou? PATROCLUS. Thou must know that knowest. ACHILLES. O, pray tell, THERSITES. I’ll decline the whole question. Agamemnon commands . Achilles is my lord; i am Patroclus’ knower; and achilles is a fool. PATROCLUS. o rascal! THERSITES. Peace, fool! i have not done. ACHILLES. achilles is a privileg’d man. o Thersites. THERSITES. Agamemnon is a fool; Achilles is a fool Thersites is a fool; and as aforesaid, Patroclus is a fool. ACHILLES. Derive this; o THERSITES. Agamemnon is a fool to offer to command . Achilles is a fool to be commanded of Agamemnon; achilles is a fool to be such a fool; and yet Patroclus is a fool . PATROCLUS. Why am I a fool? THERSITES. Make that name of the Creator. It tells me thou art. Look , who comes here? Enter : Ulysses, Nestor, Diomedes, Ajax , Calchas. ACHILLES. Come, Patroclus, and speak with nobody. Come speak with me, Thersites. [_Exit_.] but Here is such patchery, such juggling, and such knavery. and the argument is a fool and a cuckold—a good way to draw emulous factions to bleed to death upon. take the dry serpigo on the subject, and war and war confound all! [_Exit_.] AGAMEMNON. where is Achilles? PATROCLUS. Within the tent; but ill-dispos’d, my lord AGAMEMNON. Let it be known to him that we are here. He shent our men and we lay by our appertainings, visiting of him. let him be told so; [MASK] perchance, he think We [MASK] not move the question': The size of tensor a (523) must match the size of tensor b (512) at non-singleton dimension 1
Error processing chunk 'cheese, my digestion, why hast thou not brought thyself in to my house so many meals? Come, o Agamemnon? THERSITES. Thy commander, and Then tell me, Patroclus, about Achilles? PATROCLUS. Thy lord, and Then tell me, I ask thee, what’s Thersites? THERSITES. thy knower, Patroclus. Then tell , Patroclus, what art thou? and Thou must tell that , ACHILLES. O, tell, tell, or I’ll decline the whole . Agamemnon commands Achilles; Achilles , my lord; I am thy knower; and Patroclus is thy fool. PATROCLUS. You rascal! make Peace, fool! I have already done. ACHILLES. He is a privileg’d man. Proceed, Thersites. but Agamemnon is a fool; achilles is a fool; Thersites is a fool; and, as always Patroclus is a fool. i Derive this; come. THERSITES. achilles is a fool to attempt to command Achilles; Achilles is a fool to be afraid of Agamemnon; Thersites is a fool to serve such a fool; and this Patroclus is a fool positive. PATROCLUS. why am I a fool? i Make that demand of my Creator. It suffices me to art. Look you, who is here? Enter Agamemnon, Ulysses, and Diomedes, Ajax and Calchas. now Come, Patroclus, I’ll speak to nobody. Come in with the Thersites. [_Exit_.] THERSITES. Here is such patchery, such juggling, and such knavery. All the world is a whore and a cuckold—a good quarrel to make emulous factions and bleed to death upon. Now the great serpigo on the subject, and war and lechery confound . [_Exit_.] AGAMEMNON. Where is this PATROCLUS. Within his tent; his ill-dispos’d, my lord. AGAMEMNON. let it be known to all that we are here. thou shent our messengers; and we lay by Our appertainings, speak of him. Let him be told so; lest, perchance, i think We dare not [MASK] the question': The size of tensor a (515) must match the size of tensor b (512) at non-singleton dimension 1
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.22201351933310776
T-Statistic: -3.1365710180077406
P-Value: 0.0518009301559042
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 168
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 169
___________________________________________________________________________________________________________________________
```
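The tensor-size errors in run 168 (518, 523 and 515 versus 512) occur because some masked chunks tokenize to more than BERT's 512-token input limit. With the `transformers` tokenizer this is typically avoided by passing `truncation=True, max_length=512` when encoding. The model-free sketch below illustrates the two usual options, truncating or windowing the token sequence; the helper names are illustrative, not from the notebook.

```python
# Minimal sketch of avoiding the 512-token overflow seen above.
# With a real tokenizer the equivalent is:
#   tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
MAX_LEN = 512

def truncate_tokens(token_ids, max_len=MAX_LEN):
    """Keep at most max_len tokens, preserving the leading context."""
    return token_ids[:max_len]

def split_into_windows(token_ids, max_len=MAX_LEN):
    """Alternative to truncation: process the chunk in max_len-sized
    windows so no text is silently dropped."""
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]

ids = list(range(518))                            # stands in for a 518-token chunk
print(len(truncate_tokens(ids)))                  # 512
print([len(w) for w in split_into_windows(ids)])  # [512, 6]
```

Truncation is simpler but discards the overflow (including any trailing [MASK] positions); windowing keeps everything at the cost of scoring each window separately.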

```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.23333333333333334
T-Statistic: -5.497267759562843
P-Value: 0.011845956731503078
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.21610169491525424
T-Statistic: -3.493722261155749
P-Value: 0.0396601427679115
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 169
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 170
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.2941256830601093
T-Statistic: -2.1752046582440823
P-Value: 0.1178783573581168
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1910546139359699
T-Statistic: -7.418137270026101
P-Value: 0.017691506692045566
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 170
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 171
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4098360655737705
Average of Other Ratios: 0.325
T-Statistic: -3.3934426229508206
P-Value: 0.04266659593484531
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.23192090395480225
T-Statistic: -3.3773352617852765
P-Value: 0.043176200293171145
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 171
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 172
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.18333333333333335
T-Statistic: -10.579553917424954
P-Value: 0.0018041539811479287
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1432909604519774
T-Statistic: -1.8928833055825962
P-Value: 0.15471358909740393
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 172
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 173
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4
Average of Other Ratios: 0.20737704918032787
T-Statistic: -4.581868223019867
P-Value: 0.019519384720467742
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1290018832391714
T-Statistic: -3.8230779561170367
P-Value: 0.06211218967841154
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 173
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 174
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5
Average of Other Ratios: 0.31489071038251365
T-Statistic: -3.002786519391192
P-Value: 0.05754103974524036
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.2193502824858757
T-Statistic: -4.75599598618577
P-Value: 0.017644508859181555
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 174
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 175
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.31666666666666665
Average of Other Ratios: 0.23674863387978146
T-Statistic: -3.2609722394321854
P-Value: 0.04709695971232935
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2033898305084746
Average of Other Ratios: 0.1601694915254237
T-Statistic: -2.0803333919424123
P-Value: 0.12896148504661395
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 175
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 176
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.2693306010928962
T-Statistic: -5.216676982893284
P-Value: 0.013697022421412816
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.1901129943502825
T-Statistic: -3.182414988821109
P-Value: 0.05000120219940348
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 176
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 177
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.30703551912568305
T-Statistic: -26.318257342843076
P-Value: 0.00012035017922562646
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3389830508474576
Average of Other Ratios: 0.2531779661016949
T-Statistic: -6.182185493474629
P-Value: 0.008522718517249426
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 177
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 178
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.27363387978142073
T-Statistic: -11.501316579943088
P-Value: 0.0014110204730729838
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3559322033898305
Average of Other Ratios: 0.2658898305084746
T-Statistic: -4.202979499690264
P-Value: 0.024585862477064943
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 178
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 179
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.28750000000000003
T-Statistic: -6.210497654675213
P-Value: 0.00841340269771356
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.2192090395480226
T-Statistic: -3.4729797480883895
P-Value: 0.040259338911251656
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 179
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 180
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.20833333333333334
T-Statistic: -23.92854336374633
P-Value: 0.00015995510518182317
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3050847457627119
Average of Other Ratios: 0.19399717514124296
T-Statistic: -7.840633887955942
P-Value: 0.004320681154632681
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 180
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 181
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.36065573770491804
Average of Other Ratios: 0.3
T-Statistic: -2.5734050069412073
P-Value: 0.08224938731599696
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.19413841807909604
T-Statistic: -5.2860346002987235
P-Value: 0.013206166418066021
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 181
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 182
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.47540983606557374
Average of Other Ratios: 0.2833333333333333
T-Statistic: -3.914708631165495
P-Value: 0.029629359927274702
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.2078154425612053
T-Statistic: -2.134529747722321
P-Value: 0.16636576065135147
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 182
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 183
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5081967213114754
Average of Other Ratios: 0.3
T-Statistic: -5.782581278907251
P-Value: 0.010285339228476416
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.22372881355932203
T-Statistic: -3.59486813709167
P-Value: 0.036895807895617604
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 183
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 184
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.2489754098360656
T-Statistic: -9.227144083318263
P-Value: 0.002692797704338326
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3220338983050847
Average of Other Ratios: 0.12245762711864407
T-Statistic: -7.84890994931086
P-Value: 0.004307536410571643
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 184
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 185
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.275
T-Statistic: -4.488369320415114
P-Value: 0.02063340350086202
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2833333333333333
Average of Other Ratios: 0.21610169491525422
T-Statistic: -2.401102376173316
P-Value: 0.09577949808406833
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 185
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 186
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5409836065573771
Average of Other Ratios: 0.2791666666666667
T-Statistic: -7.354405200045387
P-Value: 0.00519578998775744
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.20254237288135596
T-Statistic: -7.415534221028932
P-Value: 0.005073622594306211
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 186
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 187
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4098360655737705
Average of Other Ratios: 0.2708333333333333
T-Statistic: -9.766999910411144
P-Value: 0.002280530521151582
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.1772598870056497
T-Statistic: -6.333333333333335
P-Value: 0.007959883216421762
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 187
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 188
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.3333333333333333
Average of Other Ratios: 0.22659380692167577
T-Statistic: -9.085817324099237
P-Value: 0.011897809590563803
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.18333333333333332
Average of Other Ratios: 0.15677966101694915
T-Statistic: -6.26666666666667
P-Value: 0.00820192086968827
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 188
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 189
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.25416666666666665
T-Statistic: -6.766299230685365
P-Value: 0.0065959876919741
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2542372881355932
Average of Other Ratios: 0.19413841807909604
T-Statistic: -5.382088936904238
P-Value: 0.012563625893657768
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 189
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 190
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.7213114754098361
Average of Other Ratios: 0.27499999999999997
T-Statistic: -8.885189155139434
P-Value: 0.0030061660806281146
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.2711864406779661
Average of Other Ratios: 0.23545197740112994
T-Statistic: -1.9022556390977454
P-Value: 0.30811702486531156
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 190
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 191
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.48333333333333334
Average of Other Ratios: 0.34446721311475414
T-Statistic: -3.3256460363533833
P-Value: 0.04486462240497135
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.35
Average of Other Ratios: 0.2754237288135593
T-Statistic: -9.191300234460838
P-Value: 0.0027235534322141448
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 191
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 192
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.65
Average of Other Ratios: 0.26584699453551913
T-Statistic: -15.779833617982495
P-Value: 0.0005532481835581896
The highest ratio is significantly different from the others.
```
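The result blocks above all follow one pattern: the highest match ratio is compared against the remaining ratios with a t-test at a 0.05 significance level. A minimal sketch of how such a comparison could be produced is below; the function name `compare_highest`, the use of `scipy.stats.ttest_1samp`, and the sample ratios are illustrative assumptions, not the notebook's exact code.

```python
# Hypothetical sketch: reproduce the logged "Highest Match Ratio" vs.
# "Average of Other Ratios" comparison with a one-sample t-test.
from scipy.stats import ttest_1samp

def compare_highest(ratios, alpha=0.05):
    """Test whether the highest match ratio differs from the rest (alpha = 0.05)."""
    highest = max(ratios)
    others = list(ratios)
    others.remove(highest)  # drop one instance of the maximum
    avg_others = sum(others) / len(others)
    # t is negative when the others' mean lies below the highest ratio,
    # matching the signs seen in the logs above.
    result = ttest_1samp(others, popmean=highest)
    print(f"Highest Match Ratio: {highest}")
    print(f"Average of Other Ratios: {avg_others}")
    print(f"T-Statistic: {result.statistic}")
    print(f"P-Value: {result.pvalue}")
    verdict = "" if result.pvalue < alpha else "not "
    print(f"The highest ratio is {verdict}significantly different from the others.")
    return float(result.statistic), float(result.pvalue)
```

For example, `compare_highest([0.65, 0.26, 0.27, 0.25, 0.28])` yields a negative t-statistic and a p-value below 0.05, i.e. a "significantly different" verdict like most iterations logged here.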

```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.20338983050847456
T-Statistic: -2.536300556483895
P-Value: 0.08495405146875473
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 192
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 193
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5666666666666667
Average of Other Ratios: 0.2067622950819672
T-Statistic: -6.043879310010205
P-Value: 0.009084400202355151
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23728813559322035
Average of Other Ratios: 0.1307909604519774
T-Statistic: -10.043517801177494
P-Value: 0.002101488368509634
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 193
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 194
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6229508196721312
Average of Other Ratios: 0.225
T-Statistic: -9.957417838496047
P-Value: 0.0021551750735026
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.14766949152542375
T-Statistic: -6.856800905858473
P-Value: 0.006350573689746284
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 194
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 195
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4426229508196721
Average of Other Ratios: 0.2583333333333333
T-Statistic: -3.3466253328190545
P-Value: 0.04416940727951125
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.16871468926553673
T-Statistic: -7.1482687865653824
P-Value: 0.005637501684561561
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 195
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 196
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.2692622950819672
T-Statistic: -4.2061085388157045
P-Value: 0.02453744768447861
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.23333333333333334
Average of Other Ratios: 0.1483050847457627
T-Statistic: -2.948006683383839
P-Value: 0.06012104778921462
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 196
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 197
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.45
Average of Other Ratios: 0.23654371584699452
T-Statistic: -14.867978061029115
P-Value: 0.0006602177614744704
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.26666666666666666
Average of Other Ratios: 0.14830508474576273
T-Statistic: -4.1037036556074025
P-Value: 0.02618807381282589
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 197
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 198
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6229508196721312
Average of Other Ratios: 0.1875
T-Statistic: -34.83606557377048
P-Value: 5.20111686902031e-05
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.22033898305084745
Average of Other Ratios: 0.16885593220338985
T-Statistic: -6.912825719494809
P-Value: 0.0062046516135269335
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 198
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 199
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.20710382513661202
T-Statistic: -10.283901901516982
P-Value: 0.001960684970337362
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.21666666666666667
Average of Other Ratios: 0.1694915254237288
T-Statistic: -3.936227748605116
P-Value: 0.02920926328962092
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 199
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 200
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.28592896174863386
T-Statistic: -4.828799478606252
P-Value: 0.01693007546409189
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.288135593220339
Average of Other Ratios: 0.19858757062146892
T-Statistic: -3.923159163483533
P-Value: 0.029463468310785026
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 200
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 201
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.7213114754098361
Average of Other Ratios: 0.3416666666666667
T-Statistic: -7.851608784383327
P-Value: 0.004303261311903815
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.3416666666666667
T-Statistic: -7.374780272477125
P-Value: 0.0051546431499948815
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 201
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 202
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6065573770491803
Average of Other Ratios: 0.37916666666666665
T-Statistic: -6.507337920439277
P-Value: 0.007370864736714179
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.346045197740113
T-Statistic: -6.302829818170098
P-Value: 0.008069446993235066
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 202
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 203
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.489275956284153
T-Statistic: -6.06668077104039
P-Value: 0.00898852350687363
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.5423728813559322
Average of Other Ratios: 0.38834745762711864
T-Statistic: -7.130307147698573
P-Value: 0.005678300191929151
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 203
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 204
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.373155737704918
T-Statistic: -5.4116120970679615
P-Value: 0.01237436958129694
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4576271186440678
Average of Other Ratios: 0.36292372881355933
T-Statistic: -3.837369345431411
P-Value: 0.031204798309479358
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 204
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 205
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4918032786885246
Average of Other Ratios: 0.3875
T-Statistic: -3.170655272045821
P-Value: 0.050455235210516786
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.37542372881355934
T-Statistic: -7.414573731136387
P-Value: 0.005075512686812505
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 205
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 206
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5901639344262295
Average of Other Ratios: 0.4125
T-Statistic: -5.932056727240316
P-Value: 0.0095745359301512
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.43333333333333335
Average of Other Ratios: 0.3686440677966102
T-Statistic: -4.46962233410428
P-Value: 0.020866618588713408
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 206
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 207
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6065573770491803
Average of Other Ratios: 0.39999999999999997
T-Statistic: -4.741062246648036
P-Value: 0.017795875960800792
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.3601694915254237
T-Statistic: -7.358286550031938
P-Value: 0.005187918406325382
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 207
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 208
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6557377049180327
Average of Other Ratios: 0.39583333333333337
T-Statistic: -6.819438116767888
P-Value: 0.006450408520510395
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.3504237288135593
T-Statistic: -6.6711622996447275
P-Value: 0.006867611409633449
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 208
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 209
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6166666666666667
Average of Other Ratios: 0.41427595628415304
T-Statistic: -3.5454803247028934
P-Value: 0.03821378114440935
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3898305084745763
Average of Other Ratios: 0.3543785310734463
T-Statistic: -3.127498225142409
P-Value: 0.05216674452503871
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 209
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 210
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5737704918032787
Average of Other Ratios: 0.37083333333333335
T-Statistic: -10.982657455064276
P-Value: 0.0016163522474045274
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.3898305084745763
Average of Other Ratios: 0.3331920903954802
T-Statistic: -1.7611959878594572
P-Value: 0.1764230443495778
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 210
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 211
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.4810109289617486
T-Statistic: -2.397936955599478
P-Value: 0.09605254927181417
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.5254237288135594
Average of Other Ratios: 0.4387005649717514
T-Statistic: -3.5958734600175015
P-Value: 0.03686958706248058
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 211
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 212
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6557377049180327
Average of Other Ratios: 0.39166666666666666
T-Statistic: -4.380390416901878
P-Value: 0.022024716850259
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4576271186440678
Average of Other Ratios: 0.3257062146892655
T-Statistic: -6.924029541793238
P-Value: 0.02022773333106733
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 212
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 213
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5245901639344263
Average of Other Ratios: 0.36249999999999993
T-Statistic: -5.412060981830556
P-Value: 0.012371520890864776
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4576271186440678
Average of Other Ratios: 0.3334745762711865
T-Statistic: -4.989086378646126
P-Value: 0.015484895041677766
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 213
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 214
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5666666666666667
Average of Other Ratios: 0.4185792349726776
T-Statistic: -3.1318929933163084
P-Value: 0.05198914240331166
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.29555084745762716
T-Statistic: -4.560758399398884
P-Value: 0.019763952369741263
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 214
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 215
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.3980191256830601
T-Statistic: -6.2294612621620695
P-Value: 0.008341210430613992
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.37570621468926557
T-Statistic: -3.448652473575024
P-Value: 0.04097685603288608
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 215
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 216
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.43142076502732246
T-Statistic: -11.855239270966623
P-Value: 0.00129040506090979
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.388135593220339
T-Statistic: -4.65582342119235
P-Value: 0.018692942668843485
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 216
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 217
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5166666666666667
Average of Other Ratios: 0.4312158469945355
T-Statistic: -2.769971026007625
P-Value: 0.06957017387261798
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.37125706214689264
T-Statistic: -4.6417007257391365
P-Value: 0.018847202898111523
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 217
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 218
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5666666666666667
Average of Other Ratios: 0.3818306010928962
T-Statistic: -4.847117449604144
P-Value: 0.016756266049925205
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.3008474576271187
T-Statistic: -3.3764082011477075
P-Value: 0.04320576532803089
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 218
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 219
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.4064890710382514
T-Statistic: -1.6875505812729021
P-Value: 0.19008194720260654
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.3799435028248587
T-Statistic: -2.299434047232845
P-Value: 0.10504286102353275
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 219
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 220
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.4
T-Statistic: -8.742170122442447
P-Value: 0.003151555065586373
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.3674435028248587
T-Statistic: -3.575513903469711
P-Value: 0.03740527845126834
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 220
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 221
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6
Average of Other Ratios: 0.47274590163934427
T-Statistic: -3.8037548196566555
P-Value: 0.03192304772258143
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.4132768361581921
T-Statistic: -2.677380542667445
P-Value: 0.07521629738306508
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 221
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 222
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.3689207650273224
T-Statistic: -6.200033744114341
P-Value: 0.00845358911672797
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.5166666666666667
Average of Other Ratios: 0.3559322033898305
T-Statistic: -8.779860612843027
P-Value: 0.0031123529183814187
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 222
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 223
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6229508196721312
Average of Other Ratios: 0.425
T-Statistic: -12.405173285892113
P-Value: 0.0011287421498101565
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.38210922787193974
T-Statistic: -2.7837850452128095
P-Value: 0.10845055325071953
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 223
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 224
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.3605874316939891
T-Statistic: -5.648887116271631
P-Value: 0.010980796736842062
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4915254237288136
Average of Other Ratios: 0.35812146892655367
T-Statistic: -2.5160267825812963
P-Value: 0.08647807452830744
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 224
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 225
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.39364754098360655
T-Statistic: -4.293991625472901
P-Value: 0.023226546554816672
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4166666666666667
Average of Other Ratios: 0.3601694915254237
T-Statistic: -3.233808333817773
P-Value: 0.0480764627944047
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 225
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 226
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5833333333333334
Average of Other Ratios: 0.40614754098360656
T-Statistic: -4.422690436146169
P-Value: 0.02146562039496988
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.36694915254237287
T-Statistic: -2.474962294633946
P-Value: 0.08966910755373639
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 226
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 227
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6166666666666667
Average of Other Ratios: 0.38148907103825136
T-Statistic: -8.30926126344939
P-Value: 0.0036525000991545153
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4745762711864407
Average of Other Ratios: 0.3294491525423729
T-Statistic: -5.302012165253797
P-Value: 0.013096358284056045
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 227
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 228
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6065573770491803
Average of Other Ratios: 0.38749999999999996
T-Statistic: -3.4641024571780417
P-Value: 0.04051930159431608
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.3333333333333333
T-Statistic: -4.82600482600724
P-Value: 0.016956798300543117
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 228
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 229
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6166666666666667
Average of Other Ratios: 0.3691939890710383
T-Statistic: -6.170366319105602
P-Value: 0.008568905214827621
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.3177966101694915
T-Statistic: -7.026666666666668
P-Value: 0.0059214753537602605
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 229
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 230
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5573770491803278
Average of Other Ratios: 0.4041666666666667
T-Statistic: -3.05362313626129
P-Value: 0.05526992091860083
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.5084745762711864
Average of Other Ratios: 0.379590395480226
T-Statistic: -4.111518045317934
P-Value: 0.02605716031917639
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 230
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 231
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6833333333333333
Average of Other Ratios: 0.42711748633879776
T-Statistic: -7.960605911691153
P-Value: 0.004135187577765976
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4576271186440678
Average of Other Ratios: 0.41007532956685494
T-Statistic: -4.605203601134296
P-Value: 0.04405939035816047
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 231
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 232
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.45163934426229513
T-Statistic: -3.2153935344715583
P-Value: 0.04875516994578165
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.5084745762711864
Average of Other Ratios: 0.40127118644067794
T-Statistic: -2.4156666462852487
P-Value: 0.09453530050793599
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 232
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 233
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.43995901639344265
T-Statistic: -3.5143217545081513
P-Value: 0.03907627630046548
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4666666666666667
Average of Other Ratios: 0.3686440677966102
T-Statistic: -3.612819691689752
P-Value: 0.036431155449573864
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 233
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 234
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.6666666666666666
Average of Other Ratios: 0.3816256830601093
T-Statistic: -11.100144808896685
P-Value: 0.0015665315397856033
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.5254237288135594
Average of Other Ratios: 0.36299435028248583
T-Statistic: -4.890297438969469
P-Value: 0.016355645150575224
The highest ratio is significantly different from the others.
___________________________________________________________________________________________________________________________
Done 234
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 235
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.5333333333333333
Average of Other Ratios: 0.4523224043715847
T-Statistic: -2.5642179367732165
P-Value: 0.082909074165789
The highest ratio is not significantly different from the others.
```

```
Highest Match Ratio: 0.4406779661016949
Average of Other Ratios: 0.3626412429378531
T-Statistic: -2.3461567951496813
P-Value: 0.1006566092112587
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 235
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 236
___________________________________________________________________________________________________________________________
```

```
Highest Match Ratio: 0.55
Average of Other Ratios: 0.4187158469945355
T-Statistic: -3.611089663775107
P-Value: 0.03647560832552664
The highest ratio is significantly different from the others.
```

```
Highest Match Ratio: 0.4915254237288136
Average of Other Ratios: 0.41035781544256117
T-Statistic: -3.6256912791125826
P-Value: 0.06836231728156876
The highest ratio is not significantly different from the others.
___________________________________________________________________________________________________________________________
Done 236
___________________________________________________________________________________________________________________________
___________________________________________________________________________________________________________________________
Doing 237
___________________________________________________________________________________________________________________________
```